Graduation Year

2026

Date of Submission

12-2025

Document Type

Campus Only Senior Thesis

Degree Name

Bachelor of Arts

Department

Mathematical Sciences

Reader 1

Mark Huber

Abstract

This paper examines sentiment trends surrounding women’s college basketball athletes Caitlin Clark and Angel Reese in micro-blogging social media text from the 2023–2024 NCAA season. Two datasets of different sizes and sources were analyzed to contextualize and validate findings across platforms. The study employs both a lexicon-based approach and a machine learning predictive method. Term frequency, TF-IDF, the VADER and NRC lexicons, and n-gram analysis were used to measure word-level sentiment and emotional patterns. For modeling, a random forest classifier was implemented to extend the analysis beyond lexicon-based insights. Results reveal clear disparities in how each athlete is discussed online: Clark consistently receives more positive sentiment, while Reese faces more polarized and negative commentary. Emotion mining further highlights these differences, showing that Clark is associated with joy and anticipation, whereas Reese is more frequently linked to anger and fear driven by narrative framing and media storylines. The patterns found in the lexicon analysis were corroborated by emotion mining and modeling, showing how sentiment analysis can effectively capture public perception in sports.
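As a rough illustration of the term-frequency and TF-IDF weighting named in the abstract, the following is a minimal sketch in plain Python. It is not the thesis's implementation: the whitespace tokenizer, the unsmoothed logarithmic IDF, the function names, and the three toy posts are all assumptions made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace only.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed weighting: tf = count / document length,
    idf = log(N / document frequency), with no smoothing.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Invented placeholder posts, not data from the study.
posts = [
    "great shot great game",
    "tough game tonight",
    "great defense tonight",
]
scores = tf_idf(posts)

# Print the terms of the first post, highest weight first.
for term, weight in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Under this weighting, a term that occurs in every document scores zero, which is why TF-IDF tends to surface words distinctive to one post or one athlete's corpus rather than words common to both.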

This thesis is restricted to the Claremont Colleges current faculty, students, and staff.
