Researcher ORCID Identifier
https://orcid.org/0009-0005-9594-9992
Graduation Year
2023
Date of Submission
4-2023
Document Type
Campus Only Senior Thesis
Degree Name
Bachelor of Arts
Department
Mathematical Sciences
Reader 1
Mike Izbicki
Terms of Use & License Information
Rights Information
© 2023 Olivia J Renfro
Abstract
Sentiment analysis is widely used across industries, and lexicon-based labeling and machine learning approaches have been extensively compared. While machine learning models have shown higher accuracy, they require manually labeled training data, which is time-consuming and costly to produce. This study compares the performance of eight hybrid sentiment analysis models on Twitter data, using the SentiWordNet and VADER polarity lexicons. Different transformation techniques were used to create numerical feature maps, which were then fed into Linear SVM and Random Forest (RF) models. SVM with TF-IDF outperformed RF for most hybrid models, while RF performed better than SVM in almost all word-embedding variations. RF performed exceptionally well for both lexicons, even though it is less cited in the sentiment analysis literature, and only SVM with TF-IDF transformations was competitive with RF. SVM consistently performed the worst with word-embedding transformations for both polarity lexicons.
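The hybrid approach summarized in the abstract can be illustrated with a minimal sketch: a polarity lexicon assigns labels to unlabeled text, and a transformation such as TF-IDF turns the same text into numerical features. This is not the thesis code (see the data repository for that). The toy lexicon, the function names, and the sample tweets are hypothetical stand-ins, the smoothed IDF formula is one common variant chosen for illustration, and the final classifier step (Linear SVM or Random Forest) is omitted.

```python
import math
from collections import Counter

# Hypothetical toy polarity lexicon, standing in for SentiWordNet or VADER.
TOY_LEXICON = {"love": 1.0, "great": 0.8, "good": 0.5,
               "bad": -0.5, "awful": -0.8, "hate": -1.0}

def lexicon_label(text):
    """Return 1 (positive), -1 (negative), or 0 (neutral) from summed word polarities."""
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in text.lower().split())
    return (score > 0) - (score < 0)

def tfidf_features(texts):
    """Return the sorted vocabulary and one TF-IDF vector per text (smoothed IDF)."""
    docs = [Counter(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: number of texts containing each word.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    # Term frequency is the word count divided by the text length.
    vectors = [[d[w] / sum(d.values()) * idf[w] for w in vocab] for d in docs]
    return vocab, vectors

# Made-up example texts; the thesis uses Twitter data.
tweets = ["I love this great phone",
          "awful battery and bad screen",
          "the phone arrived today"]

labels = [lexicon_label(t) for t in tweets]   # lexicon step: labels without manual annotation
vocab, X = tfidf_features(tweets)             # transformation step: numerical feature map
```

In a full pipeline, `X` and `labels` would then serve as training input for a supervised classifier, which removes the need for hand-labeled data.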
Recommended Citation
Renfro, Olivia, "SentiWordNet and VADER: Comparative Analysis of the Efficacy of Hybrid Sentiment Analysis Models" (2023). CMC Senior Theses. 3317.
https://scholarship.claremont.edu/cmc_theses/3317
Data Repository Link
https://github.com/livrenfro/Comparative-Analysis-on-Hybrid-SA-Models
This thesis is restricted to the Claremont Colleges current faculty, students, and staff.