Date of Award
Fall 2020
Degree Type
Open Access Dissertation
Degree Name
Computational Science Joint PhD with San Diego State University, PhD
Program
Institute of Mathematical Sciences
Advisor/Supervisor/Committee Chair
Richard Levine
Dissertation or Thesis Committee Member
John Angus
Dissertation or Thesis Committee Member
Barbara Bailey
Dissertation or Thesis Committee Member
Juanjuan Fan
Terms of Use & License Information
Rights Information
© 2020 Joshua Beemer
Keywords
Educational Data Mining, Ensemble Learning, Higher Education, Learning Analytics, Propensity Score
Abstract
Student success efficacy studies aim to assess instructional practices and learning environments by evaluating their success and characterizing the student subgroups that may benefit from them. We develop an ensemble learning approach to perform these analytics tasks, with a specific focus on estimating individualized treatment effects (ITE). The ITE is a measure from the personalized medicine literature that quantifies, for each student, the impact of the intervention strategy on that student's performance, even though any given student is observed under only one condition (i.e., is either in the treatment group or in the control group). We illustrate our learning analytics methods in the study of a supplemental instruction component for a large-enrollment introductory statistics course recognized as a curriculum bottleneck at San Diego State University. As part of this application, we show how the ensemble estimate of the ITE may be used to assess the pedagogical reform (supplemental instruction), advise students into supplemental instruction at the beginning of the course, and quantify the impact of the supplemental instruction component on at-risk subgroups.
Higher education researchers and institutional research practitioners struggle with the analysis of observational study data and the estimation of treatment effects. Propensity score matching has been widely accepted as a way to counteract the selection bias inherent in these studies. We present an ensemble learner for propensity score estimation, and consider the use of inverse probability of treatment weighting (IPTW), variance stabilization weighting, and weight truncation to improve treatment effect estimation over propensity score matching.
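The three weighting ideas named above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and does not use the matchED package: the function names, the fixed truncation bounds, and the toy data are all hypothetical, and the propensity scores are assumed to be already estimated (by an ensemble learner or any other model). The sketch uses the common textbook forms: IPTW weights of 1/e for treated units and 1/(1 - e) for controls, stabilized weights that multiply these by the marginal probability of the observed treatment status, and truncation that caps extreme weights.

```python
def iptw_weights(treated, propensity, stabilize=False):
    """Inverse probability of treatment weights.

    treated    : list of 0/1 treatment indicators
    propensity : list of estimated P(treated = 1 | covariates), each in (0, 1)
    stabilize  : if True, multiply by the marginal probability of the
                 observed treatment status (stabilized weights)
    """
    p_treat = sum(treated) / len(treated)  # marginal treatment probability
    weights = []
    for t, e in zip(treated, propensity):
        if t == 1:
            w = 1.0 / e
            if stabilize:
                w *= p_treat
        else:
            w = 1.0 / (1.0 - e)
            if stabilize:
                w *= 1.0 - p_treat
        weights.append(w)
    return weights


def truncate_weights(weights, lower, upper):
    """Cap weights to [lower, upper]; the bounds here are hypothetical and
    would typically be chosen from percentiles of the weight distribution."""
    return [min(max(w, lower), upper) for w in weights]


def weighted_mean_difference(outcome, treated, weights):
    """Weighted difference in mean outcome, treated minus control."""
    num_t = den_t = num_c = den_c = 0.0
    for y, t, w in zip(outcome, treated, weights):
        if t == 1:
            num_t += w * y
            den_t += w
        else:
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c


# Toy data, made up purely for illustration.
treated = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.25, 0.2]
outcome = [3.0, 1.0, 4.0, 2.0]

raw = iptw_weights(treated, propensity)
stabilized = iptw_weights(treated, propensity, stabilize=True)
capped = truncate_weights(raw, lower=0.0, upper=3.0)
effect = weighted_mean_difference(outcome, treated, raw)
```

Stabilization leaves the weighted mean difference unchanged in this simple estimator, because each group's weights are scaled by a constant; its benefit is in reducing the variance of the weights, while truncation trades a small bias for protection against a few very large weights.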
We run a simulation study to validate the treatment effect and propensity score estimation performance of the ensemble learner compared to logistic regression and random forest within the matching and weighting techniques. The results show that the ensemble learner combined with variance stabilization and truncation yields the lowest mean squared error for treatment effect estimation. We contribute a new package in the statistical software environment R, matchED, that provides educational researchers with a tool to help analyze student success study data and present actionable results. A tutorial guides the user through the use of each function and its parameters. A student success intervention is evaluated using the matchED package, and we show that the intervention helps reduce an inherent equity gap between students in the intervention and their peers.
Recommended Citation
Beemer, Joshua Ryan. (2020). Ensemble Learning Methods for Educational Data Mining Applications. CGU Theses & Dissertations, 282. https://scholarship.claremont.edu/cgu_etd/282.