Researcher ORCID Identifier

https://orcid.org/0000-0002-9081-0968

Graduation Year

2021

Date of Submission

5-2021

Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts

Department

Economics

Reader 1

Nishant Dass

Reader 2

Mike Izbicki

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2021 Seungho (Samuel) Lee

Abstract

This paper attempts to quantify the predictive power of social media sentiment and financial data in stock prediction, using a comprehensive set of stock-related fundamental and technical variables together with social media sentiment variables. For sentiment analysis, the study employs a pretrained finBERT model that assigns each text one of three sentiment classes and reports the corresponding softmax scores. The significance of these variables is then evaluated with XGBoost regression and the SHapley Additive exPlanations (SHAP) framework. By investigating feature importance, the study finds that statistical properties of the sentiment variables provide stronger predictive power than a weighted sentiment score, and that it is possible to quantify the impact individual features have on so-called “black box” models.
