Graduation Year
2018
Date of Submission
12-2017
Document Type
Campus Only Senior Thesis
Degree Name
Bachelor of Arts
Department
Mathematics
Reader 1
Blake Hunter
Terms of Use & License Information
Rights Information
© 2017 Sydney Smith
Abstract
This paper explores topic modeling using the text of Alice in Wonderland as an example. It examines both singular value decomposition (SVD) and non-negative matrix factorization (NMF) as methods for feature extraction. The paper then explores partially supervised topic modeling by introducing themes. A large portion of the paper focuses on implementing these techniques in Python and on visualizing the results using a combination of Python, HTML, and JavaScript with the D3 framework. The paper concludes by presenting a mixture of SVD, NMF, and partially supervised NMF as a possible way to improve topic modeling.
Recommended Citation
Smith, Sydney, "Approaches to Natural Language Processing" (2018). CMC Senior Theses. 1817.
https://scholarship.claremont.edu/cmc_theses/1817
This thesis is restricted to current faculty, students, and staff of the Claremont Colleges.