A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

Pfohl, Stephen R; Zhang, Haoran; Xu, Yizhe; Foryciarz, Agata; Ghassemi, Marzyeh; Shah, Nigam H

dc.creator	Pfohl, Stephen R
dc.creator	Zhang, Haoran
dc.creator	Xu, Yizhe
dc.creator	Foryciarz, Agata
dc.creator	Ghassemi, Marzyeh
dc.creator	Shah, Nigam H
dc.date	2022-07-13T17:56:33Z
dc.date	2022
dc.date	2022-07-13T17:54:23Z
dc.date.accessioned	2023-02-17T20:08:03Z
dc.date.available	2023-02-17T20:08:03Z
dc.identifier	https://hdl.handle.net/1721.1/143724
dc.identifier	Pfohl, Stephen R, Zhang, Haoran, Xu, Yizhe, Foryciarz, Agata, Ghassemi, Marzyeh et al. 2022. "A comparison of approaches to improve worst-case predictive model performance over patient subpopulations." Scientific Reports, 12 (1).
dc.identifier.uri	http://localhost:8080/xmlui/handle/CUHPOERS/242124
dc.description	Abstract: Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
dc.format	application/pdf
dc.language	en
dc.publisher	Springer Science and Business Media LLC
dc.relation	10.1038/S41598-022-07167-7
dc.relation	Scientific Reports
dc.rights	Creative Commons Attribution 4.0 International license
dc.rights	https://creativecommons.org/licenses/by/4.0/
dc.source	Scientific Reports
dc.title	A comparison of approaches to improve worst-case predictive model performance over patient subpopulations
dc.type	Article
dc.type	http://purl.org/eprint/type/JournalArticle

Files in this item

Files	Size	Format
s41598-022-07167-7.pdf	1.431Mb	application/pdf

This item appears in the following Collection(s)

DSpace@MIT [2699]
DSpace@MIT is a digital repository for MIT's research, including peer-reviewed articles, technical reports, working papers, theses, and more.
