Date of Award

2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Program

Health Outcomes and Policy Research

Track

Health Services Research

Research Advisor

Charisse Madlock-Brown

Committee

Rebecca Reynolds; Satish Kedia; Satya Surbhi; Simonne Nouer

Keywords

lung cancer, lung cancer screening, machine learning, prediction models, social determinants of health

Abstract

This research assesses the impact of incorporating social determinants of health (SDoH) into screening models for predicting lung cancer among high-risk patients. Lung cancer is typically diagnosed at a late stage, when survival and health outcomes are poor. Screening guidelines are currently in place; however, patients at high risk can be missed. Short-term predictions are needed to better identify patients at an earlier stage. SDoH features, such as socioeconomic status, can provide social context that improves identification of patients who might otherwise have been overlooked. The proposed framework includes developing an early detection tool that incorporates more inclusive criteria for predicting lung cancer while using data that are routinely collected in electronic health record (EHR) databases and can be feasibly incorporated into clinical settings. This research demonstrated that including SDoH features within prediction models allows for better identification of patients at risk of developing lung cancer. First, the study assessed the associations between social determinants and lung cancer incidence. Specific clinical features, along with higher vulnerability in the SDoH domains of ethnic and minority status and of housing type and transportation, were identified as significant features associated with lung cancer. These findings demonstrated that social and environmental factors contribute meaningfully to risk beyond traditional clinical variables. Second, the study developed a screening model for high-risk patients that includes SDoH features. Results showed that adding SDoH to the prediction model improves its performance in identifying high-risk patients. Lastly, the study evaluated models, with and without SDoH, for algorithmic unfairness across specific subgroups. The addition of SDoH to the prediction model improved fairness for some subgroups but reduced overall performance for others.
These findings emphasize the heterogeneous effects of prediction models: risk identification can be enhanced for some populations while biases are exacerbated for others. This research can help identify at-risk patients at an earlier stage and capture patients who would traditionally have been missed. Early identification of patients at high risk of developing lung cancer will allow clinicians to implement recognized screening guidelines efficiently, identify lung cancer at earlier stages, and ultimately decrease adverse health outcomes among the patients at highest risk. Future work could evaluate the integration of individual-level SDoH and assess model performance over a longer follow-up period.

ORCID

0009-0001-2367-816X

DOI

10.21007/etd.cghs.2025.0705

Included in

Public Health Commons
