Date of Award
Fall 2024
Degree Type
Restricted to Claremont Colleges Dissertation
Degree Name
Information Systems and Technology, PhD
Program
Center for Information Systems and Technology
Advisor/Supervisor/Committee Chair
Chinazunwa Uwaoma
Dissertation or Thesis Committee Member
Mark Abdollahian
Dissertation or Thesis Committee Member
Sarah Osailan
Terms of Use & License Information
Rights Information
© 2024 Marielle Garcia-Huynh
Keywords
anxiety, data science, depression, health informatics, machine learning, substance use disorder
Subject Categories
Applied Mathematics
Abstract
The pervasive impact of substance use disorder (SUD), depression, and anxiety necessitates advanced predictive strategies to effectively mitigate the societal and healthcare burdens of these conditions. Employing a quantitative methodology, this study focuses on predicting individuals at risk for substance use disorder, depression, or anxiety using health plan data. The research utilizes several machine learning algorithms, including Logistic Regression, Random Forest, Support Vector Machines (SVM), XGBoost, K-Nearest Neighbors (KNN), Naïve Bayes, Decision Trees, Neural Networks, CatBoost, and Ensemble Learning. By examining medical diagnoses from the 12 months prior to the first diagnosis of these conditions, the study developed models to identify individuals likely to develop these mental health disorders. It also assessed the accuracy and reliability of various machine learning models over a year. Results demonstrated that Random Forest, Neural Networks, and XGBoost outperformed other models, with Random Forest achieving an accuracy of 0.90 and an AUC of 0.92. However, the ensemble learning approach using Bayesian Model Averaging (BMA) provided the most robust results, with a test set accuracy of 0.8352 and an AUC of 0.8925. Multiple metrics, including accuracy, precision, recall, specificity, and F1 score, were used for evaluation. The study concluded that machine learning models, especially ensemble techniques, are effective in predicting mental health disorders and can enhance patient outcomes and healthcare efficiency. The research contributes to healthcare analytics by offering not only actionable insights for improving patient care and resource allocation, but also a robust method for measuring how effective machine learning models are at identifying possible SUD, anxiety, and depression in health plan data.
ISBN
9798346861065
Recommended Citation
Garcia-Huynh, Marielle Dizon. (2024). Enhancing Risk Stratification for Substance Use Disorder, Depression, and Anxiety through Quantitative Predictive Analytics. CGU Theses & Dissertations, 898. https://scholarship.claremont.edu/cgu_etd/898.