Date of Award


Degree Type

Open Access Dissertation

Degree Name

Mathematics, PhD


Institute of Mathematical Sciences

Advisor/Supervisor/Committee Chair

Qidi Pen

Dissertation or Thesis Committee Member

John Angus

Dissertation or Thesis Committee Member

David Drew

Dissertation or Thesis Committee Member

Cherie Ichinose

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2024 Sunny Nguyet Le


Keywords

Academic achievement, Machine learning, Self-efficacy, Ethnic group

Subject Categories

Education | Mathematics


The Introductory Psychology Statistics course is a notable challenge for psychology majors and often acts as a gatekeeper course. This study pursues two primary objectives using machine learning techniques: (1) to identify the determinants of overall course achievement, specifically course grade, and (2) to investigate the influence of statistics anxiety and statistics self-efficacy, when both are present, on overall course grade. Both objectives were addressed with a machine-learning approach. A self-reported questionnaire was developed to capture perceptions of statistics anxiety and statistics self-efficacy, along with demographic and academic background variables. The study was conducted at California State University, Fullerton, which has a diverse population of approximately 40,000, and the collected data were subjected to a factor importance analysis to assess each factor’s impact on course grade. Because the dataset contains mixed data types, including numerical and categorical variables, analyzing the factors’ relative importance posed a significant challenge. To address this challenge, two novel “model-free” ranking scores were employed: the ensemble vote count and the average partial dependence ranking score. These ranking scores, applicable across regression and classification methods and to both numerical and categorical factors, were derived from a voting ensemble comprising four competitive base learners: forward stepwise subset selection, backward stepwise subset selection, LASSO, and random forest. Using the ensemble vote count derived from these machine learning algorithms, this study presents a robust methodology for identifying significant factors contributing to academic achievement in introductory psychology statistics courses.
Notably, statistics self-efficacy and statistics anxiety emerged as the top determinants: higher levels of self-efficacy were positively associated with academic success, while higher levels of anxiety were associated with lower course grades. Furthermore, the study’s examination of gender and ethnic group differences revealed disparities in self-efficacy perceptions, highlighting the complex interplay between psychological factors and academic outcomes. These findings offer valuable insights for educators and policymakers seeking to address challenges and foster success in statistics education. Moreover, this study underscores the potential of machine learning techniques for analyzing educational data and identifying factors that influence academic achievement, paving the way for future research in this fast-growing field.
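
The ensemble vote count mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition used in the dissertation, so the version below assumes the simplest reading: each of the four base learners casts one vote for every factor it selects, and factors are ranked by total votes. The function name `ensemble_vote_count`, the factor names, and the selections are hypothetical placeholders, not results from the study.

```python
from collections import Counter


def ensemble_vote_count(selections):
    """Rank factors by how many base learners selected them.

    `selections` maps a learner name to the factors that learner kept.
    Assumption: one vote per learner per factor (not the dissertation's
    confirmed definition).
    """
    votes = Counter()
    for chosen in selections.values():
        # set() ensures a learner cannot vote twice for the same factor
        votes.update(set(chosen))
    # Highest vote count first; ties broken alphabetically by factor name
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical selections for illustration only, not data from the study
selections = {
    "forward_stepwise": ["factor_a", "factor_b"],
    "backward_stepwise": ["factor_a", "factor_b", "factor_c"],
    "lasso": ["factor_a", "factor_c"],
    "random_forest": ["factor_a", "factor_b", "factor_d"],
}

ranking = ensemble_vote_count(selections)
for factor, count in ranking:
    print(f"{factor}: {count} of {len(selections)} votes")
```

Because the score depends only on which factors each learner selects, not on coefficients or importance values, it can be compared across regression and classification learners and across numerical and categorical factors, which is consistent with the "model-free" description above.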