Graduation Year

2017

Date of Submission

5-2017

Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts

Department

Mathematics

Second Department

Computer Science

Reader 1

Blake Hunter

Rights Information

© 2017 Alex A Waggoner

Abstract

Topic modeling is the process of algorithmically sorting documents into categories based on a relationship the documents share; that shared relationship is considered the “topic” of the documents. Sentiment analysis is the process of algorithmically sorting a document into a positive or negative category depending on whether it expresses a positive or negative opinion of its topic. In this paper, I consider the open problem of classifying a document into both a topic category and a sentiment category. This has a direct application in the retail industry, where companies may want to scour the web for documents (blogs, Amazon reviews, etc.) that both discuss their product and give an opinion of it (positive, negative, or neutral). My solution uses a Non-negative Matrix Factorization (NMF) technique to determine the topic classifications of a document set, and then further factors the matrix to discover the sentiment expressed about that category of product.
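
As a rough illustration of the topic step only, the sketch below factors a small document-term matrix V into non-negative factors W (documents by topics) and H (topics by terms), then assigns each document to its highest-weight topic. It is not the thesis's implementation: the toy vocabulary and counts, the `nmf` helper, the Lee-Seung multiplicative updates, and the rank of 2 are all assumptions made for the example. The abstract does not specify how the second, sentiment-level factorization is carried out, so that step is not shown.

```python
import numpy as np


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W @ H with W, H >= 0.

    Hypothetical helper: uses multiplicative updates that reduce the
    Frobenius reconstruction error; the thesis may use a different solver.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = V.shape
    W = rng.random((n_docs, rank)) + eps   # document-topic weights
    H = rng.random((rank, n_terms)) + eps  # topic-term weights
    for _ in range(n_iter):
        # Element-wise updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data (assumed): rows are documents, columns are word counts.
vocab = ["battery", "screen", "phone", "flavor", "coffee", "roast"]
V = np.array(
    [
        [3, 2, 4, 0, 0, 0],  # phone review
        [2, 3, 3, 0, 0, 0],  # phone review
        [0, 0, 0, 3, 4, 2],  # coffee review
        [0, 0, 0, 2, 3, 3],  # coffee review
    ],
    dtype=float,
)

W, H = nmf(V, rank=2)

# Each document is labeled with the topic that has the largest weight in W.
doc_topics = W.argmax(axis=1)

# The heaviest terms in each row of H describe that topic.
top_terms = [[vocab[j] for j in np.argsort(row)[::-1][:3]] for row in H]

print(doc_topics)
print(top_terms)
```

Which topic receives index 0 or 1 depends on the random initialization, so only the grouping of documents is meaningful, not the label values themselves.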
