Analysis and Predictive Modeling of Anxiety and Depression Prevalence in the US Since the Onset of
Ronit Chakraborty (Research Intern, Lake Forest College) Dr. Sugata Banerji (Professor, Lake Forest College)
Poster # 63
In this study, we collected and analyzed data on anxiety and depressive disorders in all 50 US states from the onset of the COVID-19 pandemic until July 2023, in conjunction with 24 potential causative factors in the following categories: socioeconomic indicators; health insurance coverage; healthcare infrastructure; usage of therapy, mental health medications, and telemedicine; demographics; penetration of social media; and the prevalence of long COVID. The datasets were sourced from the Centers for Disease Control and Prevention Household Pulse Survey (2020-2023), the US Census Bureau, the Kaiser Family Foundation, and Internet World Stats. We employed three analysis and modeling techniques, each yielding insightful results. First, t-distributed stochastic neighbor embedding (t-SNE): we mapped the 24 factors into 2 dimensions, and the resulting embedding shows that states with high rates of anxiety/depression form a cluster that contains 7 of the 10 highest-prevalence states. This indicates a substantial predictive relationship between the explanatory variables and the rate of anxiety/depression. Second, correlation analysis: we showed that availability of private health insurance has the strongest correlation (-0.77) with the occurrence of anxiety or depression; higher availability of private insurance correlates with lower occurrence of anxiety/depression. Two other notable factors are the occurrence of significant activity limitations from long COVID and the prevalence of long COVID. The correlations are positive in both cases, indicating that anxiety or depression rates increase with long COVID. Third, a random forest-based machine learning model: we split the data randomly into training and test sets and trained a random forest model to predict the rates of anxiety or depression.
We obtained a 95% predictive efficacy on the training set and 69% on the test set, indicating a significant predictive relationship between the factors and anxiety/depression prevalence.
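The three techniques described above can be sketched roughly as follows. This is an illustration only, not the authors' code: it runs on randomly generated stand-in data rather than the study's datasets, and the function names, the column names (`factor_0` and so on), the t-SNE perplexity, the 80/20 split, the forest size, and the use of R^2 as the score are all assumptions. The abstract does not state which metric "predictive efficacy" refers to.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def make_synthetic_states(n_states=50, n_factors=24, seed=0):
    """Build stand-in data: one row per state, one column per factor.

    The values are random and purely hypothetical; they are NOT the study data.
    """
    rng = np.random.default_rng(seed)
    X = pd.DataFrame(
        rng.normal(size=(n_states, n_factors)),
        columns=[f"factor_{i}" for i in range(n_factors)],
    )
    # Hypothetical prevalence that depends on two of the factors plus noise.
    y = 30 - 3 * X["factor_0"] + 2 * X["factor_1"] + rng.normal(size=n_states)
    return X, y.rename("prevalence")


def embed_2d(X, seed=0):
    """Map the standardized factors into 2 dimensions with t-SNE."""
    scaled = StandardScaler().fit_transform(X)
    # Perplexity must be smaller than the number of samples (50 states here).
    tsne = TSNE(n_components=2, perplexity=10, random_state=seed)
    return tsne.fit_transform(scaled)


def rank_correlations(X, y):
    """Pearson correlation of each factor with prevalence, strongest first."""
    corr = X.corrwith(y)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


def fit_forest(X, y, seed=0):
    """Random train/test split, then a random forest regressor.

    Returns (train R^2, test R^2); R^2 is an assumed choice of metric.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_tr, y_tr)
    return model.score(X_tr, y_tr), model.score(X_te, y_te)


if __name__ == "__main__":
    X, y = make_synthetic_states()
    embedding = embed_2d(X)            # one 2-D point per state
    ranked = rank_correlations(X, y)   # signed correlations, strongest first
    train_r2, test_r2 = fit_forest(X, y)
    print(embedding.shape)
    print(ranked.head(3))
    print(f"train R^2 = {train_r2:.2f}, test R^2 = {test_r2:.2f}")
```

With only 50 rows, a gap between training and test scores like the one reported above is expected from a random forest, so in a sketch like this the test-set score is the more informative of the two numbers.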