Document Type

Open Access Senior Thesis

Degree Name

Bachelor of Arts

Department

Computer Science

Second Department

Mathematical Sciences

Reader 1

Mark Huber

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2022 Emanuel Jarquin

Abstract


Sports betting is growing in popularity, particularly in soccer (or ‘football’, as it is known outside the United States). As a result, using machine learning to predict the outcomes of soccer matches has attracted considerable attention in recent years. In this thesis, I apply well-known machine learning techniques to predict the outcomes of El Clásico matchups and compare their predictive performance. The methods are random forests, fit with the party package in R, and extreme gradient boosting, fit with the xgboost package. The dataset was built from historical soccer data that includes match and team statistics.

Included in

Data Science Commons