Date of Award
2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Program
Health Outcomes and Policy Research
Track
Health Services Research
Research Advisor
Charisse Madlock-Brown
Committee
Rebecca Reynolds; Satish Kedia; Satya Surbhi; Simonne Nouer
Keywords
lung cancer, lung cancer screening, machine learning, prediction models, social determinants of health
Abstract
This research aims to assess the impact of incorporating social determinants of health (SDoH) into screening models for predicting lung cancer among high-risk patients. Lung cancer is typically diagnosed at a late stage, when survival and health outcomes are poor. Screening guidelines are currently in place; however, patients at high risk can be missed. There is a need for short-term predictions to better identify patients at an earlier stage. SDoH features, such as socioeconomic status, can provide social context that improves identification of patients who might otherwise have been overlooked. The proposed framework includes developing an early detection tool that incorporates more inclusive criteria for predicting lung cancer while using data that are routinely collected in electronic health record (EHR) databases and can be feasibly incorporated into clinical settings. This research demonstrated that including SDoH features within prediction models allows for better identification of patients at risk of developing lung cancer. First, the study assessed the associations between social determinants and lung cancer incidence. Specific clinical features, along with higher vulnerability in the SDoH domains of ethnic and minority status and housing type and transportation, were identified as significant features associated with lung cancer. These findings demonstrated that social and environmental factors contribute meaningfully to risk beyond traditional clinical variables. Second, this study developed a screening model for high-risk patients that includes SDoH features. Results showed that adding SDoH to the prediction model improves its performance in identifying high-risk patients. Lastly, this study evaluated models, with and without SDoH, for algorithmic unfairness across specific subgroups. Adding SDoH to the prediction model improved fairness for some subgroups but reduced overall performance for others.
These findings emphasize the heterogeneous effects of prediction models, demonstrating that while risk identification can be enhanced for some populations, biases may be exacerbated for others. This research can help identify at-risk patients at an earlier stage and capture patients who would traditionally have been missed. Early identification of patients at high risk for developing lung cancer will allow clinicians to efficiently implement recognized screening guidelines, identify lung cancer at earlier stages, and ultimately decrease adverse health outcomes among patients with the highest risk of developing lung cancer. Future work could evaluate the integration of individual-level SDoH and assess model performance over a longer follow-up duration.
ORCID
0009-0001-2367-816X
DOI
10.21007/etd.cghs.2025.0705
Recommended Citation
Jackson, Bianca (0009-0001-2367-816X), "Incorporating Social Determinants of Health Into a Screening Tool Using Machine Learning to Predict Lung Cancer Diagnosis Among High Risk Patients" (2025). Theses and Dissertations (ETD). Paper 725. http://dx.doi.org/10.21007/etd.cghs.2025.0705.
https://dc.uthsc.edu/dissertations/725