Electrical and Computer Engineering ETDs
Publication Date
Summer 7-8-2019
Abstract
Electronic health records (EHRs) contain the clinical history of patients. The enormous potential for discovery in such rich data is hampered by its complexity. We hypothesize that machine learning models trained on EHR data can predict future clinical events significantly better than current models. We analyze an EHR database of 594,862 echocardiography studies from 272,280 unique patients with both unsupervised and supervised machine learning techniques.
In the unsupervised approach, we first develop a simulation framework to evaluate a family of different clustering pipelines. We apply the optimized approach to 41,645 patients with heart failure without providing any survival information to the clustering algorithm. The resulting model separates patients into groups with significantly different survival characteristics. For example, in a 10-cluster model, the highest- and lowest-risk clusters had median survival of 22 and 53 months, respectively.
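As an illustration of how median survival can be compared across clusters once the cluster labels are fixed, the sketch below computes a Kaplan-Meier median per cluster. The abstract does not say which estimator was used, so Kaplan-Meier is an assumption here, and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def km_median(durations, events):
    """Kaplan-Meier median survival time.

    durations: follow-up time per patient (e.g. months).
    events: 1 if death was observed at that time, 0 if censored.
    Returns the earliest time at which estimated survival is <= 0.5,
    or None if the curve never drops that far.
    """
    deaths = defaultdict(int)   # observed deaths at each time
    leaving = defaultdict(int)  # patients leaving the risk set at each time
    for t, e in zip(durations, events):
        leaving[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(durations)
    survival = Fraction(1)  # exact arithmetic avoids float ties at 0.5
    for t in sorted(leaving):
        if deaths[t]:
            survival *= Fraction(at_risk - deaths[t], at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving[t]
    return None


def median_survival_by_cluster(cluster_ids, durations, events):
    """Group patients by cluster label and compute each cluster's KM median."""
    grouped = defaultdict(lambda: ([], []))
    for c, t, e in zip(cluster_ids, durations, events):
        grouped[c][0].append(t)
        grouped[c][1].append(e)
    return {c: km_median(ts, es) for c, (ts, es) in grouped.items()}


# Toy data (hypothetical): two clusters with different survival profiles.
clusters = ["a"] * 5 + ["b"] * 5
months = [2, 4, 6, 8, 10, 20, 30, 40, 50, 60]
died = [1, 1, 1, 1, 1, 1, 0, 1, 1, 0]
medians = median_survival_by_cluster(clusters, months, died)
```

Survival information enters only after clustering, which mirrors the setup described above: the labels are produced without it, and the medians are used purely to evaluate the separation.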
In the supervised approach, with 723,754 videos available from 27,028 unique patients, we assess the predictive capacity of echocardiography video data for one-year mortality. We also hold out a balanced dataset of 600 patients to compare model performance against cardiologists. The best of four candidate architectures is a 3D dyadic CNN, with an average AUC of 0.78 for a single parasternal long-axis view. The model yields an accuracy of 75% (AUC of 0.8) on the held-out dataset, while the cardiologists achieve 56% and 61%. The model's performance is significantly higher than that of the cardiologists (p = 4.2e-11 and p = 6.9e-7).
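The two metrics reported above, AUC and accuracy, can be computed from predicted scores and binary one-year mortality labels as in the minimal sketch below. It uses the rank-based definition of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one); the 0.5 decision threshold and the toy data are assumptions for illustration, not values from the dissertation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Toy balanced set (hypothetical): 1 = died within one year, 0 = survived.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
```

On a balanced set such as the 600-patient hold-out, chance-level accuracy is 50%, which gives the accuracy figures for the model and the cardiologists a common baseline. AUC, by contrast, does not depend on the choice of threshold.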
Finally, we develop a multi-modal supervised approach that enables interpretability. The model provides interpretations through polynomial transformations that describe each feature's individual contribution, and it weights the transformed features to determine their importance. We validate the proposed approach using 31,278 videos from 26,793 patients, testing it against logistic regression and against non-linear, non-interpretable models based on Random Forests and XGBoost. Our results show that the proposed neural network architecture always outperforms logistic regression models, while its performance approximates that of the other non-linear models. Overall, our multi-modal classifier based on the 3D dyadic CNN and the interpretable neural network outperforms all other classifiers (AUC = 0.83).
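One way to read the interpretable model described above is as an additive model: each feature passes through its own polynomial, the transformed values are weighted, and the weighted sum is mapped to a probability. The sketch below shows only that forward pass under these assumptions. The abstract does not specify the actual architecture, polynomial degree, or training procedure, and the feature names and coefficients here are hypothetical.

```python
import math


def poly(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def predict(features, poly_coeffs, weights, bias=0.0):
    """Return (probability, per-feature contributions to the logit).

    features: feature name -> input value.
    poly_coeffs: feature name -> polynomial coefficients (lowest degree first).
    weights: feature name -> weight applied to the transformed feature.
    """
    contributions = {
        name: weights[name] * poly(poly_coeffs[name], x)
        for name, x in features.items()
    }
    logit = bias + sum(contributions.values())
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, contributions


# Hypothetical standardized inputs and parameters, for illustration only.
features = {"age_scaled": 1.0, "ef_scaled": -0.5}
poly_coeffs = {"age_scaled": [0.0, 1.0, 0.5], "ef_scaled": [0.0, -2.0]}
weights = {"age_scaled": 0.8, "ef_scaled": 0.5}
prob, contributions = predict(features, poly_coeffs, weights)
```

Because the logit is a sum of per-feature terms, each entry in `contributions` can be inspected on its own, and each feature's polynomial can be plotted to show how its effect changes over the feature's range.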
Keywords
Echocardiography, video analysis, interpretable neural network
Document Type
Dissertation
Language
English
Degree Name
Computer Engineering
Level of Degree
Doctoral
Department Name
Electrical and Computer Engineering
First Committee Member (Chair)
Marios Pattichis
Second Committee Member
Manuel Martinez-Ramon
Third Committee Member
Constantinos Pattichis
Fourth Committee Member
Brandon Fornwalt
Recommended Citation
Ulloa Cerna, Alvaro Emilio. "Large Scale Electronic Health Record Data and Echocardiography Video Analysis for Mortality Risk Prediction." (2019). https://digitalrepository.unm.edu/ece_etds/467
Included in
Biomedical Engineering and Bioengineering Commons, Computer Engineering Commons, Electrical and Computer Engineering Commons