Graduation Year

2022

Date of Submission

4-2022

Document Type

Campus Only Senior Thesis

Degree Name

Bachelor of Arts

Department

Mathematics

Second Department

Computer Science

Reader 1

Mark Huber

Rights Information

© 2022 Varun Bopardikar

Abstract

The artificial intelligence industry needs research into innovative methods for obtaining labeled training data efficiently. In this case study, we investigate one such method, weak supervision, and pair it with computer vision to programmatically label Samba TV image data. We develop a labeling function in Python based on the computer vision algorithm template matching, and we use it to label TV frames, obtained from Samba TV’s data lake, according to whether they contain the ESPN logo. We optimize for accuracy and precision, and we add an abstinence band to increase our model’s confidence. Overall, our labeling function performs well, with 99% accuracy and zero false positives, and it paves the way for more advanced labeling functions to be developed and added to the weak supervision model.
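The abstract describes the labeling function only at a high level, so the following is a minimal sketch of the general idea and not the thesis's implementation. It assumes that the abstinence band is an interval of match scores inside which the function declines to label a frame. The names `best_match_score` and `label_frame`, the label values, and the thresholds `low` and `high` are all hypothetical. A small pure-Python normalized cross-correlation stands in for a library routine such as OpenCV's `cv2.matchTemplate`, so the sketch runs without extra dependencies.

```python
from math import sqrt

# Hypothetical label values; the thesis does not specify its encoding.
ESPN, NOT_ESPN, ABSTAIN = 1, 0, -1


def best_match_score(frame, template):
    """Slide the template over the frame and return the highest
    normalized cross-correlation score, a value in [-1, 1].

    Both arguments are 2D lists of grayscale pixel values.
    """
    fh, fw = len(frame), len(frame[0])
    th, tw = len(template), len(template[0])

    t_flat = [v for row in template for v in row]
    t_mean = sum(t_flat) / len(t_flat)
    t_dev = [v - t_mean for v in t_flat]
    t_norm = sqrt(sum(d * d for d in t_dev))
    if t_norm == 0:
        raise ValueError("template must not be a flat, constant image")

    best = -1.0
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            patch = [frame[y + i][x + j] for i in range(th) for j in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_dev = [v - p_mean for v in patch]
            p_norm = sqrt(sum(d * d for d in p_dev))
            if p_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(p_dev, t_dev)) / (p_norm * t_norm)
            best = max(best, score)
    return best


def label_frame(frame, template, low=0.5, high=0.9):
    """Label a frame, abstaining when the score falls inside (low, high).

    The thresholds are illustrative assumptions, not values from the thesis.
    """
    score = best_match_score(frame, template)
    if score >= high:
        return ESPN
    if score <= low:
        return NOT_ESPN
    return ABSTAIN


# Toy data: a 2x2 "logo" and three tiny frames.
logo = [[0, 255], [255, 0]]
frame_with_logo = [[10, 10, 10], [10, 0, 255], [10, 255, 0]]
blank_frame = [[0, 0, 0], [0, 0, 0]]
ambiguous_frame = [[0, 255], [255, 200]]

print(label_frame(frame_with_logo, logo))
print(label_frame(blank_frame, logo))
print(label_frame(ambiguous_frame, logo))
```

Widening the band trades coverage for confidence: more frames go unlabeled, but the frames that do receive a label are less likely to be false positives, which is the trade-off the abstract refers to.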

This thesis is restricted to the Claremont Colleges current faculty, students, and staff.
