Graduation Year

2016

Date of Submission

4-2016

Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts

Reader 1

Beth Trushkowsky

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2016 Amit Maor

Abstract

Data analytics queries often involve aggregating over massive amounts of data, in order to detect trends in the data, make predictions about future data, and make business decisions as a result. As such, it is important that a database management system (DBMS) handling data analytics queries perform well when those queries involve massive amounts of data. A data warehouse is a DBMS which is designed specifically to handle data analytics queries.

This thesis describes the data warehouse Amazon Redshift, and how it was used to design a data analysis system for Laserfiche. Laserfiche is a software company that provides each of their clients a system to store and process business process data. Through the 2015-16 Harvey Mudd College Clinic project, the Clinic team built a data analysis system that provides Laserfiche clients with near real-time reports containing analyses of their business process data. This thesis discusses the advantages of Redshift’s data model and physical storage layout, as well as Redshift’s features directly benefit of the data analysis system.

Share

COinS