ARTICLE
TITLE

A Simple Free-Text-like Method for Extracting Semi-Structured Data from Electronic Health Records: Exemplified in Prediction of In-Hospital Mortality

Eyal Klang    
Matthew A. Levin    
Shelly Soffer    
Alexis Zebrowski    
Benjamin S. Glicksberg    
Brendan G. Carr    
Jolion McGreevy
David L. Reich and Robert Freeman    

Abstract

Epic is a commonly used electronic health record (EHR) system in the United States. This EHR contains large semi-structured "flowsheet" fields. Flowsheet fields lack a well-defined data dictionary and are unique to each site. We evaluated a simple free-text-like method to extract these data. As a use case, we demonstrate this method in predicting mortality during emergency department (ED) triage. We retrieved demographic and clinical data for ED visits from the Epic EHR (1/2014–12/2018). Data included structured fields, semi-structured flowsheet records, and free-text notes. The study outcome was in-hospital death within 48 h. Most of the data were coded using a free-text-like Bag-of-Words (BoW) approach. Two machine-learning models were trained: gradient boosting and logistic regression. Term frequency-inverse document frequency (tf-idf) weighting was employed in the logistic regression model (LR-tf-idf). An ensemble of LR-tf-idf and gradient boosting was also evaluated. Models were trained on data from 2014–2017 and tested on data from 2018. Among 412,859 visits, the 48-h mortality rate was 0.2%. LR-tf-idf showed an AUC of 0.98 (95% CI: 0.98–0.99). Gradient boosting showed an AUC of 0.97 (95% CI: 0.96–0.99). An ensemble of both showed an AUC of 0.99 (95% CI: 0.98–0.99). In conclusion, a free-text-like approach can be useful for extracting knowledge from large amounts of complex semi-structured EHR data.
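
To make the free-text-like idea concrete, the sketch below shows one way semi-structured flowsheet rows might be flattened into tokens, counted as a Bag-of-Words, and weighted with tf-idf. It is an illustration only, not the authors' pipeline: the field names, the values, the `name=value` token format, and the smoothed idf formula are all assumptions, and the abstract does not say how numeric values were tokenized.

```python
import math
from collections import Counter


def flowsheet_to_tokens(rows):
    """Turn (field_name, value) pairs into free-text-like tokens.

    The "name=value" token format is an assumption made for illustration.
    """
    tokens = []
    for name, value in rows:
        # Normalize case and whitespace so "Pain Score" becomes "pain_score".
        key = "_".join(str(name).lower().split())
        val = "_".join(str(value).lower().split())
        tokens.append(f"{key}={val}")
    return tokens


def bag_of_words(tokens):
    """Count how often each token occurs in one visit."""
    return Counter(tokens)


def tf_idf(bags):
    """Weight each visit's token counts by a smoothed inverse document frequency."""
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # each token counted once per visit
    weighted = []
    for bag in bags:
        total = sum(bag.values())
        weighted.append({
            tok: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, count in bag.items()
        })
    return weighted


# Hypothetical triage flowsheet rows for three ED visits.
visits = [
    [("Pain Score", 8), ("Mental Status", "alert")],
    [("Pain Score", 2), ("Mental Status", "alert")],
    [("Pain Score", 8), ("Mental Status", "unresponsive")],
]
bags = [bag_of_words(flowsheet_to_tokens(v)) for v in visits]
weights = tf_idf(bags)
```

Because no data dictionary is needed to build the tokens, the same code applies to flowsheet fields that differ between sites. The resulting sparse weights could then be fed to a classifier such as logistic regression; rarer tokens (here, `mental_status=unresponsive`) receive a higher weight than common ones.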