Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts



Reader 1

Nishant Dass

Reader 2

Mike Izbicki

Rights Information

© 2021 Seungho (Samuel) Lee


This paper attempts to quantify the predictive power of social media sentiment and financial data in stock prediction, using a comprehensive set of stock-related fundamental and technical variables together with social media sentiment variables. For sentiment analysis, the study employs a pretrained finBERT model that outputs three sentiment classifications and their respective softmax scores. The significance of these variables is then evaluated with XGBoost regression and the SHapley Additive exPlanations (SHAP) framework. By investigating feature importance, the study finds that statistical properties of the sentiment variables provide stronger predictive power than a weighted sentiment score, and that it is possible to quantify the impact individual features have on so-called “black box” models.
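
To make concrete what quantifying a feature's impact on a "black box" model can mean, the following is a minimal sketch of exact Shapley values computed by brute force over all feature coalitions. It is not the thesis's pipeline and not the tree-specific algorithm the SHAP library uses for XGBoost models. The function `shapley_values`, the stand-in `toy_model`, the feature names, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x, relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n in the number of features, so this is only for tiny n.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take their real value.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical stand-in for a trained regressor (assumption, not from the thesis).
    sentiment_mean, sentiment_std, momentum = z
    return 0.5 * sentiment_mean + 2.0 * sentiment_std * momentum


x = [0.4, 0.3, 1.0]          # hypothetical feature values for one stock-day
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["sentiment_mean", "sentiment_std", "momentum"], phi):
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the model's prediction at `x` and at `baseline`, which is the additivity property that lets each feature's contribution be read as a share of the prediction.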