
Predicting adolescent suicidal behavior following inpatient discharge using structured and unstructured data

Bibliographic Details
Published in: Journal of Affective Disorders, 2024-04, Vol. 350, p. 382-387
Main Authors: Carson, Nicholas J., Yang, Xinyu, Mullin, Brian, Stettenbauer, Elizabeth, Waddington, Marin, Zhang, Alice, Williams, Peyton, Rios Perez, Gabriel E., Cook, Benjamin Lê
Format: Article
Language: English
Description
Summary: The objective was to develop and assess the performance of an algorithm predicting suicide-related ICD codes within three months of psychiatric discharge. This prognostic study used a retrospective cohort of EHR data from 2789 youth (12 to 20 years old) hospitalized in a safety net institution in the Northeastern United States. The dataset combined structured data with unstructured data obtained through natural language processing of clinical notes. The machine learning analysis compared gradient boosting with random forest models. The areas under the ROC curve and the precision-recall curve were 0.88 and 0.17, respectively, for the final gradient boosting model. The cutoff point of the model-generated predicted probabilities of suicide that optimally classified an individual as high risk or not was 0.009. When the chosen cutoff (0.009) was applied to the hold-out testing set, the model correctly identified 8 positive cases out of 10 and 418 negative cases out of 548. The corresponding performance metrics were 80 % sensitivity, 76 % specificity, 6 % PPV, 99 % NPV, an F-1 score of 0.11, and an accuracy of 76 %. The data in this study come from a single health system, possibly introducing bias in the model's algorithm. Thus, the model may have underestimated the incidence of suicidal behavior in the study population. Further research should include EHRs from multiple systems. These performance metrics suggest a benefit to including both unstructured and structured data in the design of predictive algorithms for suicidal behavior, which can be integrated into psychiatric services to help assess risk.
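The metrics quoted in the summary can be recomputed from the four hold-out counts it reports. The short Python sketch below is only an arithmetic check and is not the authors' code; the function name `confusion_metrics` and the variable names are illustrative assumptions, and the false-negative and false-positive counts are derived by subtraction from the totals given in the summary.

```python
# Arithmetic check of the hold-out metrics quoted in the summary.
# Not the authors' code: names here are illustrative assumptions.

# Counts stated in the summary at the 0.009 cutoff.
tp = 8            # positive cases correctly identified (of 10)
fn = 10 - tp      # positive cases missed (derived by subtraction)
tn = 418          # negative cases correctly identified (of 548)
fp = 548 - tn     # negative cases flagged as high risk (derived)


def confusion_metrics(tp, fn, tn, fp):
    """Return standard binary-classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)          # recall on positive cases
    specificity = tn / (tn + fp)          # recall on negative cases
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


metrics = confusion_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With only 10 positive cases among 558 test records, the low PPV and F-1 score sit alongside high sensitivity and NPV, which is the pattern expected when the outcome is rare.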
• Suicidal behavior in adolescents is elevated in the months following inpatient care.
• Machine learning (ML) applied to EHR yields predictive tools for suicide post-discharge.
• Including structured and unstructured data in ML models has not been well assessed.
• Including both data sources yielded good performance of a predictive algorithm.
• Natural language processing of unstructured data may enhance such predictive tools.
ISSN: 0165-0327, 1573-2517
DOI: 10.1016/j.jad.2023.12.059