Open Access Senior Thesis
Bachelor of Arts
Michael C. Frank
John G. Milton
© 2021 Naiti S. Bhatt
While adults recognize objects near-instantly, infants must learn to categorize the objects in their visual environments. Recent work has shown that egocentric head-mounted camera videos contain rich data that illuminate the infant experience (Clerkin et al., 2017; Franchak et al., 2011; Yoshida & Smith, 2008). Whereas past work has focused on the social information in view, here we aim to characterize the objects in infants’ at-home visual environments by adapting modern computer vision models to the infant view. To do so, we collected manual annotations of objects that infants appeared to be interacting with in a set of frames from SAYCam, a longitudinal dataset of egocentric head-camera videos (Sullivan et al., 2020), and we used these annotations to fine-tune region-based convolutional neural networks for object detection and segmentation (Lin et al., 2017; He et al., 2017). We found that objects in infant visual scenes follow a right-skewed, Zipfian-like distribution: a few objects appear many times, while most objects appear only a few times. This skew affected our model fine-tuning, attempted on 10 categories: models trained on the skewed distribution learned only a few objects well and the rest poorly. These findings and limitations help drive future work on infant category and language learning by elucidating the statistics of infant visual experience and by highlighting the challenge of fine-tuning on skewed data distributions.
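The Zipfian rank-frequency pattern described above can be checked with a log-log linear fit: under a Zipfian distribution, an object's frequency is roughly proportional to 1/rank, so log-frequency against log-rank is linear with slope near -1. The sketch below uses hypothetical per-category counts for illustration (not the thesis's actual SAYCam annotation data), and the function name `zipf_slope` is an assumption, not anything from the thesis:

```python
import math

def zipf_slope(counts):
    """Least-squares slope of log(frequency) vs. log(rank).
    A slope near -1 suggests a Zipfian rank-frequency distribution."""
    freqs = sorted(counts, reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical per-category annotation counts: a few frequent objects
# and a long tail of rare ones (illustrative only, not SAYCam data).
counts = [1000, 500, 333, 250, 200, 167, 143, 125, 111, 100]
print(round(zipf_slope(counts), 2))  # close to -1 for this 1/rank pattern
```

A heavily right-skewed count vector like this is exactly the class-imbalance regime that makes detector fine-tuning hard: rare categories contribute too few examples for the model to learn them well.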
Bhatt, Naiti S., "Uncovering Object Categories in Infant Views" (2021). Scripps Senior Theses. 1664.
Artificial Intelligence and Robotics Commons, Cognition and Perception Commons, Data Science Commons, Developmental Psychology Commons, Longitudinal Data Analysis and Time Series Commons, Other Computer Sciences Commons