Date of Award

Fall 2024

Degree Type

Restricted to Claremont Colleges Dissertation

Degree Name

Information Systems and Technology, PhD

Program

Center for Information Systems and Technology

Advisor/Supervisor/Committee Chair

Chinazunwa Uwaoma

Dissertation or Thesis Committee Member

Mark Abdollahian

Dissertation or Thesis Committee Member

Sarah Osailan

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.

Rights Information

© 2024 Marielle Garcia-Huynh

Keywords

anxiety, data science, depression, health informatics, machine learning, substance use disorder

Subject Categories

Applied Mathematics

Abstract

The pervasive impact of substance use disorder (SUD), depression, and anxiety necessitates advanced predictive strategies to effectively mitigate the societal and healthcare burdens of these conditions. Employing a quantitative methodology, this study focuses on identifying individuals at risk for substance use disorder, depression, or anxiety using health plan data. The research utilizes several machine learning algorithms, including Logistic Regression, Random Forest, Support Vector Machines (SVM), XGBoost, K-Nearest Neighbors (KNN), Naïve Bayes, Decision Trees, Neural Networks, CatBoost, and Ensemble Learning. By examining medical diagnoses from the 12 months prior to the first diagnosis of these conditions, the study developed models to identify individuals likely to develop these mental health disorders. It also assessed the accuracy and reliability of various machine learning models over a year. Results demonstrated that Random Forest, Neural Networks, and XGBoost outperformed other models, with Random Forest achieving an accuracy of 0.90 and an AUC of 0.92. However, the ensemble learning approach using Bayesian Model Averaging (BMA) provided the most robust results, with a test set accuracy of 0.8352 and an AUC of 0.8925. Multiple metrics, including accuracy, precision, recall, specificity, and F1 score, were used for evaluation. The study concluded that machine learning models, especially ensemble techniques, are effective in predicting mental health disorders and can enhance patient outcomes and healthcare efficiency. The research contributes to healthcare analytics not only by offering actionable insights for improving patient care and resource allocation, but also by providing a robust method for measuring how effectively machine learning models identify possible SUD, anxiety, and depression in health plan data.
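The evaluation metrics named in the abstract (accuracy, precision, recall, specificity, and F1 score) can all be derived from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch of those standard definitions, not the dissertation's own code; the function name `binary_metrics` and the toy labels are hypothetical and are not drawn from the study's data or results.

```python
def binary_metrics(y_true, y_pred):
    """Compute standard binary-classification metrics from 0/1 labels.

    Hypothetical helper for illustration; 1 marks the positive class
    (for example, a member later diagnosed with one of the conditions).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up labels, used only to exercise the function.
    toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
    toy_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    for name, value in binary_metrics(toy_true, toy_pred).items():
        print(f"{name}: {value:.4f}")
```

Reporting specificity alongside recall matters when the positive class is rare, as is often the case in claims data, because accuracy alone can look high for a model that rarely flags anyone.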

ISBN

9798346861065

Available for download on Saturday, January 09, 2027
