Electronic Health Record (EHR) and Electronic Medical Record (EMR) data are increasingly used for research. However, analyzing this type of data poses many unique complications arising from how the data are collected and processed and from the types of questions that can be answered. This book covers many important topics related to using EHR/EMR data for research, including data extraction, cleaning, processing, analysis, inference, and prediction, drawing on the authors' many years of practical experience. The book carefully evaluates and compares standard statistical models and approaches with machine learning and deep learning methods, and reports unbiased comparison results for these methods in predicting clinical outcomes from EHR data.
- Written based on hands-on experience of contributors from multidisciplinary EHR research projects, which include methods and approaches from statistics, computing, informatics, data science and clinical/epidemiological domains.
- Documents detailed experience with EHR data extraction, cleaning and preparation.
- Provides a broad view of statistical approaches and machine learning prediction models to deal with the challenges and limitations of EHR data.
- Considers the complete cycle of EHR data analysis.
EHR/EMR analysis requires close collaboration among statisticians, informaticians, data scientists and clinical/epidemiological investigators. This book reflects that multidisciplinary perspective.
Table of Contents
About the Editors
List of Contributors
- Introduction: Use of EHR Data for Scientific Discoveries—Challenges and Opportunities
- EHR Project Management
- EHR Databases and Data Management: Data Query and Extraction
- EHR Data Cleaning
- EHR Data Pre-Processing and Preparation
- EHR Missing Data Issues
- Causal Inference and Analysis for EHR Data
- EHR Data Exploration, Analysis and Predictions: Statistical Models and Methods
- Neural Network and Deep Learning Methods for EHR Data
- EHR Data Analytics and Predictions: Machine Learning Methods
- Use of EHR Data for Research: Future
Yashar Talebi and Ashraf Yaseen
Gen Zhu, Vi K. Ly, Michael Gonzalez, Leqing Wu, Hulin Wu, and Ashraf Yaseen
Yashar Talebi, Han Feng, Yuefan Huang, and Vahed Maroufy
Duo Yu, Xueying Wang, and Hulin Wu
Chenguang Zhang, Vahed Maroufy, Baojiang Chen, and Hulin Wu
Stacia DeSantis, Momiao Xiong, Jose-Miguel Yamal, Gen Zhu, Duo Yu, Xueying Wang, Chenguang Zhang, and Vi K. Ly
Gen Zhu, Frances Brito, Stacia M DeSantis, and Vahed Maroufy
Duo Yu, Ashraf Yaseen, and Xi Luo
Yuxuan Gu, Yuefan Huang, Vi Ly, Ashraf Yaseen, and Hongyu Miao
- Hulin Wu, PhD, is the endowed Betty Wheless Trotter Professor and Chair, Department of Biostatistics & Data Science, School of Public Health (SPH), University of Texas Health Science Center at Houston (UTHealth). Dr. Wu also holds a joint appointment as Professor at UTHealth School of Biomedical Informatics. Dr. Wu received BS and MS training in engineering and a PhD in statistics. He has many years of experience in developing novel statistical methods, mathematical models and informatics tools for biomedical data analysis and modeling. He is the Founding Director of the Center for Big Data in Health Sciences (CBD-HS) and directs the EHR research working group at UTHealth SPH.
- Jose-Miguel Yamal is a tenured Associate Professor in the Department of Biostatistics & Data Science and a member of the Coordinating Center for Clinical Trials at UTHealth School of Public Health. Dr. Yamal has extensive experience in clinical trials, including work with data coordinating centers and service on Data Safety Monitoring Boards for clinical trials in stroke and traumatic brain injury. He has also contributed to statistical methodology for classification problems with nested data and to machine learning applications.
- Ashraf Yaseen is an Assistant Professor of Data Science at the School of Public Health, UTHealth. He has extensive experience in database design, implementation and management, machine learning, and high-performance computing. In his current research work, Dr. Yaseen is exploring big data integration and deep learning technologies in electronic health records to address clinical and public health questions.
- Vahed Maroufy, PhD, Assistant Professor, Department of Biostatistics & Data Science, UTHealth School of Public Health. Dr. Maroufy received MSc and PhD training in statistics and has experience in applied and theoretical statistics, including geometry of statistical models, mixture models, Bayesian inference, predictive models using EHR data, and analysis of genetic data in cancer research.