JMIR Research Protocols (JMIR Res Protoc; ResProt), ISSN 1929-0748

JMIR Publications, Toronto, Canada

Publisher article ID: v11i3e34201; PMID: 35333179; doi: 10.2196/34201

Article type: Protocol

Leveraging Large-Scale Electronic Health Records and Interpretable Machine Learning for Clinical Decision Making at the Emergency Department: Protocol for System Development and Validation

Editor: Gunther Eysenbach

Reviewers: Yeli Wang, Theophile Ndabu

Nan Liu, PhD (1,2,3,4); https://orcid.org/0000-0003-3610-4883

Feng Xie, BSc (1); https://orcid.org/0000-0002-0215-667X

Fahad Javaid Siddiqui, MBBS, MSc (1); https://orcid.org/0000-0002-9046-5105

Andrew Fu Wah Ho, MBBS, MPH (1,5); https://orcid.org/0000-0003-4338-3876

Bibhas Chakraborty, PhD (1,6,7); https://orcid.org/0000-0002-7366-0478

Gayathri Devi Nadarajan, MBBS (5); https://orcid.org/0000-0001-8811-7374

Kenneth Boon Kiat Tan, MBBS (5); https://orcid.org/0000-0003-2167-690X

Marcus Eng Hock Ong, MBBS, MPH (1,4,5); https://orcid.org/0000-0001-7874-7612

Corresponding author address: Programme in Health Services and Systems Research, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore; Phone: 65 66016503; Email: liu.nan@duke-nus.edu.sg

1 Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore

2 Institute of Data Science, National University of Singapore, Singapore, Singapore

3 SingHealth AI Health Program, Singapore Health Services, Singapore, Singapore

4 Health Service Research Centre, Singapore Health Services, Singapore, Singapore

5 Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore

6 Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore

7 Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States

Corresponding Author: Nan Liu liu.nan@duke-nus.edu.sg

Issue date: 03.2022; published: 25.03.2022

Volume 11, Issue 3, e34201

Article history dates: 11.10.2021, 23.11.2021, 29.11.2021, 30.11.2021

©Nan Liu, Feng Xie, Fahad Javaid Siddiqui, Andrew Fu Wah Ho, Bibhas Chakraborty, Gayathri Devi Nadarajan, Kenneth Boon Kiat Tan, Marcus Eng Hock Ong. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 25.03.2022.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.

Background

There is a growing demand globally for emergency department (ED) services. An increase in ED visits has resulted in overcrowding and longer waiting times. The triage process plays a crucial role in assessing and stratifying patients’ risks and ensuring that the critically ill promptly receive appropriate priority and emergency treatment. A substantial amount of research has been conducted on the use of machine learning tools to construct triage and risk prediction models; however, the black box nature of these models has limited their clinical application and interpretation.

Objective

In this study, we plan to develop an innovative, dynamic, and interpretable System for Emergency Risk Triage (SERT) for risk stratification in the ED by leveraging large-scale electronic health records (EHRs) and machine learning.

Methods

To achieve this objective, we will conduct a retrospective, single-center study based on a large, longitudinal data set obtained from the EHRs of the largest tertiary hospital in Singapore. Study outcomes include adverse events experienced by patients, such as the need for an intensive care unit and inpatient death. With preidentified candidate variables drawn from expert opinions and relevant literature, we will apply an interpretable machine learning–based AutoScore to develop 3 SERT scores. These 3 scores can be used at different times in the ED, that is, on arrival, during ED stay, and at admission. Furthermore, we will compare our novel SERT scores with established clinical scores and previously described black box machine learning models as baselines. Receiver operating characteristic analysis will be conducted on the testing cohorts for performance evaluation.

Results

The study is currently being conducted. The extracted data indicate approximately 1.8 million ED visits by over 810,000 unique patients. Modelling results are expected to be published in 2022.

Conclusions

The SERT scoring system proposed in this study will be unique and innovative because of its dynamic nature and modelling transparency. If successfully validated, our proposed solution will establish a standard for data processing and modelling by taking advantage of large-scale EHRs and interpretable machine learning tools.

International Registered Report Identifier (IRRID)

DERR1-10.2196/34201

Keywords: electronic health records; machine learning; clinical decision making; emergency department

Introduction

Background

Across the globe, there is increasing demand for emergency department (ED) services [1,2]. Increased ED visits have resulted in overcrowding and long waiting times [3-5]. Furthermore, adverse patient outcomes have been reported, such as mortality [6], poor patient satisfaction, and high costs [7,8]. As the first layer of emergency care in an ED, triage plays an essential role in assessing and stratifying patients’ risks and ensuring that the critically ill receive appropriate emergency treatment promptly [9].

The triage process is commonly conducted by medical staff based on their own clinical experience, the patients’ symptoms, and basic information obtained from patients during their presentation to the ED. To make this critical step more objective, triage systems have been introduced. Examples include the 5-level Emergency Severity Index [10] in the United States, the Australasian Triage Scale [11] in Australia, and the Patient Acuity Category Scale (PACS) [12] in Singapore. These systems are simple and easy to use but remain subjective and static. They are based on symptoms, yet many critically ill patients have no apparent symptoms when they arrive at the ED, and their conditions can deteriorate rapidly during their stay in the hospital. To address this limitation, more dynamic and accurate risk prediction tools are required for better patient monitoring throughout the ED journey [13].

In response to this unmet need, researchers have sought to develop multivariable predictive models and clinical scores to identify patients in the ED at risk of adverse outcomes such as admission [14,15], death [16], cardiac arrest [17], and intensive care unit (ICU) admission [18]. Such models are primarily based on patient information, vital sign instability, changes in laboratory results, and administrative records. However, some parameters may appear similar between high-risk patients and other patients during an ED visit, making the prediction models less accurate.

Additional risk factors such as comorbidities, underlying chronic diseases, past hospitalization history, and other patient-related factors should be considered [19]. Furthermore, nonpatient factors are also integral components of patient care that can impact patient outcomes. Research has identified emergency boarding as a risk factor for mortality [6]. In addition, mortality rates were found to be higher for patients admitted during periods of high ED crowding regardless of their demographic characteristics, comorbidities, or primary diagnosis [20]. Changes in shift and high patient-to-nurse ratios have also been factors of concern [21].

In building predictive models, both traditional statistical methods and machine learning tools have been thoroughly investigated. Logistic regression is the most commonly used tool to construct multivariable prediction models [16,22,23]. In recent years, machine learning and artificial intelligence (AI) have gained popularity as tools for improving model performance. Fernandes et al [24] conducted an in-depth review of the current state of AI-based clinical decision support systems for triage. A recent study in the United States demonstrated the value of machine learning models for admission prediction in near real time [13].

While AI has proven successful in developing triage and prediction models, its solutions are often black box models, limiting model interpretation [25] and clinical adoption [26]. Consequently, efforts have been made to develop sparse predictive models by leveraging machine learning and conventional statistical analysis. Ustun and Rudin [27,28] proposed Supersparse Linear Integer Model–based methods for developing interpretable scoring systems. Xie et al [29] developed the interpretable machine learning–based AutoScore framework and used it to derive the score for emergency risk prediction to estimate the probability of mortality during an inpatient stay [30].

Objective

By leveraging large-scale electronic health records (EHRs) and machine learning, we intend to create an innovative, dynamic, and interpretable System for Emergency Risk Triage (SERT) for risk stratification in the ED. This protocol describes the detailed data collection procedures, data manipulation, and predictive modelling to accomplish our goals. In particular, we will employ the AutoScore framework to construct a dynamic SERT for risk assessment at multiple decision points in the ED. Our solution will also be compared with traditional clinical triage tools and black box machine learning algorithms.

Methods

Study Setting

This is a large-scale, retrospective, single-center study conducted in Singapore. A city-state in Southeast Asia with a population of approximately 5.4 million, Singapore provides affordable health care through partial subsidies and co-payments. The study site, Singapore General Hospital, is Singapore’s largest and oldest tertiary referral hospital, with 1700 inpatient beds and over 30 clinical specialties. Each year, its ED sees more than 120,000 visits and admits 36,000 patients for inpatient care [16,31].

At public hospitals in Singapore, patients visiting EDs are triaged based on their symptoms according to the national PACS [32]. PACS-1 refers to patients who are seriously ill and require immediate medical care, PACS-2 refers to nonambulant patients who do not appear to be at risk of collapse, PACS-3 refers to ambulant patients, and PACS-4 refers to nonemergency cases. An initial triage is often recommended and used to identify patients who are more acutely ill and need immediate attention. As soon as resuscitation is required, the patient is taken directly to the resuscitation area. Otherwise, the patient will be directed either to a critical care area or a waiting area, depending on the patient’s condition.

Study Cohort and Design

The flowchart of the entire project is shown in Figure 1. In the extracted data set, there are 3 primary identifiers: “ED Case No,” “Admission Case No,” and “Patient ID,” to represent the unique ED visit, the admission case, and the patient, respectively. Figure 2 illustrates how variables are constructed from and linked to these 3 identifiers. By consolidating the selected variables, a master data set will be created. Afterwards, the constructed master data set will be processed with outlier removal and missing value handling. The interpretable machine learning framework will then be implemented, and the models will be evaluated and compared with other baseline approaches, including traditional clinical scores, machine learning, and deep learning.

Figure 1

Flowchart of the study design. EHR: electronic health record.

Figure 2

Illustration of the data linkage process of raw data tables through 3 primary identifiers. BP: blood pressure; ID: identification; ICD: International Classification of Diseases; ED: emergency department; ICU: intensive care unit; HDU: high dependency unit; SpO2: peripheral oxygen saturation; FiO2: fraction of inspired oxygen.
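The linkage through the 3 primary identifiers can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the dictionary-based table layout, and the example variables (`icu_transfer`, `charlson_index`) are hypothetical and are not taken from the study's actual pipeline; only the identifier names “ED Case No,” “Admission Case No,” and “Patient ID” come from the protocol.

```python
def build_master_rows(ed_visits, admissions, histories):
    """Link raw tables into one row per ED visit (illustrative layout).

    ed_visits:  list of dicts with "ED Case No", "Patient ID" and, when the
                visit led to hospitalization, "Admission Case No".
    admissions: dict of inpatient variables keyed by "Admission Case No".
    histories:  dict of patient-level variables keyed by "Patient ID".
    """
    master = []
    for visit in ed_visits:
        row = dict(visit)
        # Patient-level variables (eg, comorbidities) join on Patient ID.
        row.update(histories.get(visit["Patient ID"], {}))
        # Inpatient variables join on Admission Case No when one exists.
        admission_no = visit.get("Admission Case No")
        if admission_no is not None:
            row.update(admissions.get(admission_no, {}))
        master.append(row)
    return master
```

Keeping one row per “ED Case No” means that a patient with several ED visits contributes several rows, each carrying the same patient-level history.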

Singapore Health Services’ Centralized Institutional Review Board approved this study (CIRB Ref: 2021/2122), and a waiver of consent was granted to collect and analyze EHRs.

Data Source and Extraction

Study subjects have been drawn from the hospital’s EHRs using the SingHealth-IHiS Electronic Health Intelligence System, which combines data from multiple clinical, operational, and finance data sources [33]. Before analysis, all data, including the 3 primary identifiers, have been de-identified to ensure that they are sufficiently anonymous. Records of deaths are obtained from the national death registry and are matched to specific patients in our database. Relevant variables are extracted from the beginning of the ED visits until the end of the patient’s journey. Moreover, patients’ medical histories are extracted and matched for each unique patient through “Patient ID.” The extracted data were saved in multiple CSV files for subsequent processing and analysis.

Data Cleaning and Preprocessing

Data extracted from EHRs may contain many erroneous entries, as the EHRs are designed for clinical use and not explicitly modified for research purposes. As a result, the data contain noise, missing values, outliers, and duplicate or incorrect records caused by system problems or clerical errors. These issues will be addressed in several ways. First, wholly duplicated entries will be removed. Second, vital signs or laboratory test results that fall outside the normal range will be considered outliers. All outliers will be marked as missing values and handled by appropriate imputation methods (eg, mean or median value imputation based on the training data set). Third, a descriptive analysis will be conducted to determine whether the overall percentage and number are within a reasonable range.
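The first two cleaning steps and the imputation could look roughly like the sketch below. It assumes records are held as dictionaries, and the range limits in `PLAUSIBLE_RANGE` are placeholder values chosen for illustration, not the study's thresholds. The point it shows is that imputation values are estimated on the training cohort only and then applied to every cohort.

```python
from statistics import median

# Hypothetical limits for illustration; real limits need clinical review.
PLAUSIBLE_RANGE = {"pulse": (20, 300), "systolic_bp": (30, 300)}


def drop_exact_duplicates(records):
    """Remove records identical in every field, keeping the first copy."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def mark_outliers_missing(records, ranges=PLAUSIBLE_RANGE):
    """Replace out-of-range values with None so they count as missing."""
    cleaned = []
    for rec in records:
        rec = dict(rec)
        for var, (low, high) in ranges.items():
            value = rec.get(var)
            if value is not None and not (low <= value <= high):
                rec[var] = None
        cleaned.append(rec)
    return cleaned


def training_medians(train_records, variables):
    """Compute imputation values from the training cohort only."""
    return {
        var: median(r[var] for r in train_records if r.get(var) is not None)
        for var in variables
    }


def impute(records, medians):
    """Fill missing values with the training-cohort medians."""
    return [
        {**rec, **{v: m for v, m in medians.items() if rec.get(v) is None}}
        for rec in records
    ]
```

Estimating the medians on the training data alone avoids leaking information from the validation and testing cohorts into model development.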

Variable Construction

Candidate variables have been identified based on expert opinions as well as relevant literature [18,30,34-36]. Moreover, we have sought input from clinicians and informaticians familiar with the raw data to determine which features are feasible to extract and construct from the sources. The general rationale is to include all ED-relevant variables of high quality. Therefore, irrelevant, repeated, or largely missing variables will be excluded. For time-series data (such as laboratory test results and vital signs), the first, last, and average measurements are extracted and constructed for each ED episode. Past health care utilization will be derived per the patient’s medical history.
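As an illustration of the time-series step, the following sketch collapses repeated measurements into first, last, and average values for each ED episode. The tuple layout and the generated feature names (for example, `pulse_first`) are assumptions made for this example only.

```python
from collections import defaultdict
from statistics import mean


def summarize_episode_series(measurements):
    """Collapse repeated measurements into first/last/mean per ED episode.

    `measurements` is an iterable of (ed_case_no, timestamp, variable, value)
    tuples; timestamps only need to be sortable (eg, ISO 8601 strings).
    """
    grouped = defaultdict(list)
    for case_no, timestamp, variable, value in measurements:
        grouped[(case_no, variable)].append((timestamp, value))

    features = defaultdict(dict)
    for (case_no, variable), series in grouped.items():
        series.sort(key=lambda pair: pair[0])  # chronological order
        values = [value for _, value in series]
        features[case_no][f"{variable}_first"] = values[0]
        features[case_no][f"{variable}_last"] = values[-1]
        features[case_no][f"{variable}_mean"] = mean(values)
    return dict(features)
```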

Table 1 presents a list of high-level constructed variables. These variables are classified into 6 main categories depending on the time frame during which the variables could be collected: past medical history, ED triage, ED disposition, within the first 24 hours of inpatient stay, inpatient discharge, and after inpatient discharge. Variables of patient data include demographics, comorbidities, drug history, presenting vital signs, essential laboratory results, and treatments administered in the ED. There are also nonpatient variables such as ED waiting time from triage to consultation, ED boarding time (from consultation to ED disposition), patient load in the ED (number of other patients registered in the ED at that time), time of the day, and day of the week.

Outcomes

The clinical outcomes in this study include the following adverse events experienced by patients during their inpatient stay:

Admission: A hospital admission following an ED visit [37-39]. Each ED attendance is classified as admission or discharge according to the clinical decision made. As a result, patients who left before a decision could be made are excluded rather than considered discharged.

Inpatient death: A clinically certified death of a patient admitted to the hospital and who died during the hospitalization.

2/7/30-day mortality: A clinically certified death of an admitted patient that occurred within 2, 7, or 30 days after the ED visit, regardless of the place of death.

ICU transfer: Identified using the hospital’s admission, transfer, and discharge database. Whenever a patient had more than one transfer from ward to ICU, only the data before the first transfer were included.

Cardiac arrest: Defined as the loss of a palpable pulse with attempted resuscitation in the ward.

Prolonged hospital length of stay: Defined as 21 days or more for the hospital stay.
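Two of the outcome definitions above translate directly into simple rules, sketched here. The sketch assumes that the mortality windows are measured as elapsed time from the ED visit and are inclusive at the boundary, which the protocol does not spell out; the function and key names are hypothetical.

```python
from datetime import datetime, timedelta

PROLONGED_LOS_DAYS = 21  # threshold stated in the outcome definition


def mortality_flags(ed_visit_time, death_time, windows=(2, 7, 30)):
    """Flag death within each window (in days) after the ED visit."""
    flags = {}
    for days in windows:
        flags[f"death_{days}d"] = (
            death_time is not None
            and ed_visit_time <= death_time <= ed_visit_time + timedelta(days=days)
        )
    return flags


def prolonged_stay(admit_time, discharge_time):
    """True when the hospital stay lasted 21 days or more."""
    return (discharge_time - admit_time) >= timedelta(days=PROLONGED_LOS_DAYS)
```

Because the death time would come from the death registry rather than the hospital record, these flags do not depend on where the death occurred.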

Table 1

List of the high-level constructed variables in the master data set, along with their sources and categories.

Category	Subcategory	Source table	High-level variables extracted
Patient history
	Health care utilization summary	Inpatient movement, ED^a root table	Count of ED visits, emergency admissions, surgeries, ICU^b or HDU^c transfer in the patient’s history (past 30/90/180/365 days)
	Comorbidities	Diagnosis history	Charlson Comorbidity Index (17 variables; chronic disease), Elixhauser Comorbidity Index (30 variables)
Information collected at triage station
	Demographics	ED root table	Age, gender, race, nationality, postal code
	ED-prehospital	ED triage	Mode of arrival, high priority (chest pain/suspected stroke case), fever or not
	ED–triage information	ED triage	Triage waiting time, triage class (Patient Acuity Category Scale system), time of the day (midnight or not), day of the week (weekend or not)
	Triage vital signs	ED vital signs	Pulse, respiration, SpO₂^d, systolic BP^e, diastolic BP, temperature
Information collected at ED disposition
	ED vital signs	ED vital signs	Vital measurement frequency and major ED vital readings: pulse, respiration, fraction of inspired oxygen, SpO₂, systolic BP, diastolic BP, temperature, pain level scale, Glasgow coma scale, alert (extracted from physical notes)
	ED laboratory	ED laboratory	Laboratory measurement frequency and major laboratory test results: potassium, creatinine, sodium, bicarbonate, albumin, creatine kinase-MB (mass), creatine kinase, prothrombin time, N-terminal pro–B-type natriuretic peptide, C-reactive protein
	ED consultation and treatment	ED consultation, ED treatment	Services provider, consultation waiting time, ED location, length of consultation, resuscitation, major emergency surgeries, pre-selected major ordering
	ED allergy	ED allergy	Major allergy types and reasons, severity
	ED disposition and diagnosis	ED disposition, ED diagnosis	Disposition type, major primary diagnosis, secondary diagnosis (eg, trauma)
	Outcomes	ED disposition, ED root table	Admissions, mortality within ED, direct transfer to ICU
Information collected within the first 24 hours of inpatient stay
	Inpatient stay patient flow	Inpatient movement	ICU or HDU admission, ward class, duration of ICU or HDU stay, hospital departments, surgeries
	Inpatient vital	Inpatient vital signs	Pulse, respiration, SpO₂, systolic BP, diastolic BP, temperature, Glasgow coma scale, height, weight, BMI
	Inpatient laboratory	Inpatient laboratory	Laboratory measurement frequency and major laboratory test results: albumin, potassium, creatinine, sodium, bicarbonate, creatine kinase, creatine kinase-MB (mass), C-reactive protein, prothrombin time, procalcitonin, blood pH, glycated hemoglobin A1c, triglycerides, cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol
	Inpatient treatment	Inpatient treatment and order	Major medication prescription and order
Information collected at discharge
	Health care utilization summary	Inpatient movement	Count of ED visits, ED admissions, surgeries, ICU or HDU admissions last year
	Discharge information	Discharge diagnosis	Primary discharge diagnosis, discharge location, length of stay
	Outcomes	Inpatient movement, discharge diagnosis	ICU transfer, inpatient mortality, cardiac arrest, prolonged hospital length of stay
Information collected after discharge
	Outcomes	ED root table	2/7/30-day mortality, emergency readmission, ED revisit

^aED: emergency department.

^bICU: intensive care unit.

^cHDU: high dependency unit.

^dSpO₂: peripheral oxygen saturation.

^eBP: blood pressure.

Predictive Modelling for Clinical Decision Making

In this study, we will develop and validate a novel interpretable triage system for risk stratification of patients in the ED. Our proposed solution will be compared with baseline risk prediction tools such as traditional clinical scores and black box machine learning models. The extracted data set will be split into training, validation, and testing sets to build and validate the predictive models. The ED visit episodes from January 1, 2008, to December 31, 2018, will be randomly divided into 2 non-overlapping cohorts: a training cohort (80%) and a validation cohort (20%). The ED visits dated in 2019 are assigned to one testing cohort, while those dated in 2020 are assigned to a second testing cohort covering the period of the COVID-19 pandemic [40,41]. Using this sequential testing design, we will be able to test whether the population shift and the COVID-19 pandemic would impact model performance [42]. Further details are presented below.
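The cohort assignment described above can be sketched as follows. The seed value and the function names are assumptions made for this example; note also that drawing each visit independently yields an approximately, rather than exactly, 80/20 split.

```python
import random
from datetime import date


def assign_cohort(visit_date, rng, train_fraction=0.8):
    """Assign one ED visit to a cohort following the protocol's date rules."""
    if date(2008, 1, 1) <= visit_date <= date(2018, 12, 31):
        return "training" if rng.random() < train_fraction else "validation"
    if visit_date.year == 2019:
        return "testing_2019"
    if visit_date.year == 2020:
        return "testing_2020"
    return None  # outside the study period


def split_visits(visits, seed=2022):
    """Group (ed_case_no, visit_date) pairs into the four cohorts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    cohorts = {
        "training": [],
        "validation": [],
        "testing_2019": [],
        "testing_2020": [],
    }
    for case_no, visit_date in visits:
        cohort = assign_cohort(visit_date, rng)
        if cohort is not None:
            cohorts[cohort].append(case_no)
    return cohorts
```

Holding out whole calendar years for testing, instead of sampling them at random, is what allows the 2019 and 2020 cohorts to reveal any effect of population shift or of the pandemic.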

Proposed Method: Interpretable SERT

SERT consists of 3 scoring algorithms, each tailored to its application at different time points in the ED. On arrival at the triage station, SERT-1 is used to estimate patients’ likelihood of admission (inpatient and ICU) and 2-day mortality. SERT-1 is intended to assess the patient’s immediate urgency based on basic patient information, simple vital measurements, and medical histories readily available during triage. While in the ED, SERT-2 predicts patients’ admission (inpatient and ICU) and 2/7-day mortality using a variety of variables, including laboratory test results, vital signs, ED treatment, diagnosis, and some administrative information. As an extension of the SERT-1 algorithm, SERT-2 incorporates additional variables obtained during ED stay to better predict outcomes. On admission, SERT-3 predicts the likelihood of 7/30-day mortality, ICU transfer, and prolonged length of stay using variables collected in the ED and during the first 24 hours of inpatient stay. In actual clinical implementation, in the case where a patient has incomplete information, SERT will use imputation methods to fill in the missing values before calculating the risk score. In summary, SERT allows for a comprehensive risk assessment and prediction in the ED in a dynamic manner.

The clinical risk-scoring models have been traditionally developed in 2 ways: through expert opinions or consensus and conventional cohort studies. However, both approaches are labor-intensive and are not easy to update over time. Recently, we developed an interpretable machine learning–based automatic clinical score generator, AutoScore, as a practical and universal solution for risk scoring [29]. Using the AutoScore framework, users could seamlessly generate parsimonious risk models (ie, point-based sparse risk scores), thereby supporting automated machine learning solutions in health care [43]. AutoScore comprises 6 modules. In module 1, random forest is used to rank variables in terms of their contribution to modelling. Module 2 categorizes continuous variables to address nonlinearity and facilitate the generation of point-based scores. Module 3 computes scores based on a subset of variables and logistic regression, while module 4 determines the optimal number of variables based on a parsimony plot. Module 5 enables fine-tuning of the cut-off values for categorizing continuous variables for preferable interpretation, and module 6 provides a final performance evaluation. AutoScore is used to develop the 3 SERT scoring algorithms with the candidate variables and the outcomes.
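To give a feel for what a point-based score is, the sketch below rescales logistic regression coefficients of categorized variables into integer points. It is a simplified illustration of the general idea behind module 3, not AutoScore's actual implementation; the rescaling rule, the function names, and the coefficients in the usage note are all hypothetical.

```python
def coefficients_to_points(coefficients, max_total=100):
    """Turn logistic regression coefficients into integer points.

    `coefficients` maps variable -> {category: coefficient}, with each
    variable's reference category at 0. Points are rescaled so that the
    highest achievable total is approximately `max_total`.
    """
    # Shift each variable so its lowest-risk category scores 0 points.
    shifted = {
        var: {cat: beta - min(cats.values()) for cat, beta in cats.items()}
        for var, cats in coefficients.items()
    }
    max_sum = sum(max(cats.values()) for cats in shifted.values())
    scale = max_total / max_sum
    return {
        var: {cat: round(beta * scale) for cat, beta in cats.items()}
        for var, cats in shifted.items()
    }


def total_score(points, patient):
    """Sum the points of the categories a patient falls into."""
    return sum(points[var][patient[var]] for var in points)
```

With made-up coefficients such as `{"age": {"<50": 0.0, "50-75": 0.6, ">75": 1.2}, "pulse": {"60-100": 0.0, ">100": 0.8}}`, a patient older than 75 years with a pulse above 100 would receive the maximum total, and every score can be computed by hand from a small lookup table.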

Baseline Methods: Traditional Clinical Scores

Several traditional clinical scores will be calculated for performance comparison with the SERT scores. They are the PACS triage system [32], Modified Early Warning Score [44], National Early Warning Score [45], Rapid Acute Physiology Score [46], Rapid Emergency Medicine Score [47], and Cardiac Arrest Risk Triage [48].

Baseline Methods: Black Box Machine Learning Models

Additionally, several machine learning techniques will be compared as baselines for predictive modelling. Of the many machine learning algorithms, we will apply the following popular ones as examples.

Random forest [49]: As the most commonly used tree-based prediction tool, random forest will be fitted with the R package “randomForest.” The parameters will be selected based on recommendations made in previous literature [50,51], with ntree=100 and mtry set to the square root of m, the number of candidate variables (ntree: number of trees grown; mtry: number of variables randomly sampled as candidates at each split).

Least absolute shrinkage and selection operator [52]: This penalized regression technique is another popular method in clinical modelling. It employs a regularization process for variable selection to increase the statistical model’s predictive accuracy and interpretability. In our study, its regularization parameter will be optimized through 10-fold cross-validation.

Deep learning [53]: As a branch of the machine learning field that uses deep neural networks, deep learning was initially widely adopted for computer vision and image understanding before being used for medical image analysis. More recently, researchers have begun to explore deep learning for EHR analysis [54,55]. We are particularly interested in applying deep learning algorithms for adverse event prediction, drawing on the rich sources of EHR data, as described earlier. Using the PyTorch library, we will construct a long short-term memory network [56]. In addition, a multilayer perceptron [57] will be used in conjunction with long short-term memory to learn nontemporal data.

Model Comparison and Performance Metrics

To evaluate the performance of all predictive models, receiver operating characteristic (ROC) analysis will be conducted on the 2 testing cohorts. The area under the ROC curve will serve as the overall measure of predictive performance. Moreover, we will calculate measures of diagnostic accuracy, such as sensitivity, specificity, positive predictive value, and negative predictive value. These measures are determined by setting a threshold on each ROC curve. To achieve an optimal balance between sensitivity and specificity, we will select the cut-off point closest to the upper-left corner of each ROC plot. The 95% CIs of the performance measures will also be reported and compared for each model or score.
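The ROC-based evaluation can be made concrete with a small sketch: it builds the ROC points for a score, computes the area under the curve with the trapezoidal rule, and picks the cut-off nearest to the upper-left corner. The function names are hypothetical, and the analysis itself is planned in R; this is only meant to show the calculations involved.

```python
from math import hypot


def roc_points(y_true, scores):
    """Return (threshold, false positive rate, sensitivity) per cut-off.

    A patient is predicted positive when score >= threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    curve = [(0.0, 0.0)] + [(fpr, tpr) for _, fpr, tpr in points]
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(curve, curve[1:])
    )


def closest_to_corner(points):
    """Pick the cut-off whose ROC point is nearest to (FPR=0, sensitivity=1)."""
    return min(points, key=lambda p: hypot(p[1], 1 - p[2]))[0]
```

Sensitivity and specificity at the chosen cut-off follow directly from the selected point, since specificity equals 1 minus the false positive rate.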

Statistical Analysis

We will perform data analysis using R version 4.0 (R Core Team). When summarizing descriptive results, frequency and percentages are reported for categorical variables, while means and SDs are reported for continuous variables. For categorical variables, the chi-square test or Fisher exact test will be used. For numeric variables, the t test will be applied. Further, univariable and multivariable logistic regressions will be used to identify common risk factors associated with the outcomes.

Results

The raw data have been extracted, and we are currently linking and cleaning the data. In the data extraction process, we included all patients who visited the ED at Singapore General Hospital between January 1, 2008, and December 31, 2020. Patients under the age of 21 years were excluded. If the patients were admitted through the corresponding ED visit, they would be followed throughout their inpatient stay. The data set contains more than 1.8 million ED visit episodes of over 810,000 unique patients. Approximately 650,000 of these ED visits resulted in subsequent hospitalizations. Our findings and modelling results are expected to be published by 2022.

Discussion

This paper presents a protocol designed to leverage large-scale EHRs and advanced machine learning techniques for risk stratification and triage in the ED. Among numerous ED triage and risk prediction scores and tools, our proposed SERT solution is unique and innovative because of its dynamic nature and modelling transparency. This project will build on the success of our previous research on risk modelling with EHRs for patients in the ED [14,16,30].

Significance

The identification of patients’ risk at an early stage allows for better resource allocation. This is particularly important because vital sign instability may occur later in the ward, leaving a limited time window for life-saving action or decision making, which can be especially difficult in a busy hospital. Patient groups at high risk should be identified earlier in the ED and, if possible, flagged for more stringent monitoring. Similarly, low-risk patients may require less intensive monitoring and treatment, thereby saving hospital resources. The SERT system that we propose has the potential to provide a feasible solution. This system allows medical personnel to assess patient risk at multiple decision points based on various clinical and nonclinical factors. SERT measures risk sequentially and dynamically, in a manner intended to match actual clinical needs.

Strengths

First, this study uses a large set of EHR data over a 13-year period, which contains comprehensive patient information. As Singapore’s largest hospital, Singapore General Hospital provides medical care to a wide range of patients throughout the country; thus, its EHRs ensure good coverage for a large population. Additionally, the longitudinal data allow us to validate the SERT system using data before and after COVID-19. Thus, we will have the opportunity to evaluate the impact of the global pandemic on triage performance in the ED. The insights gained from system evaluation could be used to examine possible model adaptations in shifted clinical settings.

Second, the SERT triaging system we intend to develop will be transparent and easily understandable. All 3 SERT scores are parsimonious and point-based, as only the most significant variables are considered in their formulation. Their formats follow the same convention as widely used clinical scores such as the National Early Warning Score and Modified Early Warning Score, allowing for easy comprehension and quick adoption. In contrast, black box machine learning models are challenging to comprehend, making them inaccessible to clinicians [25]. Although there are techniques for post hoc model explanation, most machine learning models are not inherently interpretable [25].

Third, this project aims to develop a dynamic system capable of identifying risk strata at different decision points in the ED. During the initial triage process and the patient’s stay in the ED, SERT predicts the likelihood of inpatient and ICU admissions. Whenever variables are altered, the scores can be updated, making the risk assessment dynamic and practical. In addition, SERT can make mortality predictions to assess the likelihood of the worst outcomes for patients who will be hospitalized.

Lastly, the simple form of the scores in SERT permits a variety of implementation schemes. As an example, the actual implementation can be as simple as a mobile app. Users may input relevant information into the app, which will return a risk score at the time of inquiry. The SERT scoring platform can also be easily integrated into existing information technology systems, which requires only simple calculations and therefore little computing power. The application can be designed and implemented in real time, similar to that seen in a recent study in the United States [13].

Limitations and Future Plan

Although the study site is the largest hospital in the country, the SERT system may not apply to international institutions where EDs operate differently. We intend to conduct cross-institutional validation of our system with both local and international partners. Where the SERT system itself proves not to be feasible, the methods we use can be adapted to the local context because AutoScore is a generic scoring tool that permits the creation of interpretable clinical scores. In addition, we anticipate a sparse data set with numerous missing values, particularly for comorbidities, medications, and time series records of vitals and laboratory test results. To address the issue of data sparsity, we will examine various data imputation strategies and feature representation techniques.

Our future efforts will include identifying opportunities to conduct a rigorously designed randomized trial to evaluate the system. In the long term, we hope to expand the evaluation to a multicenter trial involving several countries.

Conclusions

Clinical decision making has widely benefited from the use of machine learning techniques. However, the black box models created by these methods hinder their use in actual clinical practice. Our study aims to address this issue by proposing an innovative SERT scoring system. An interpretable machine learning–based AutoScore framework will be used to create a series of 3 SERT scores that can be used in the medical setting at various decision points throughout the patient’s journey. The SERT system is notable for its dynamic nature and transparency. If validated successfully, it will establish a standard for data processing and modeling by utilizing large-scale EHRs and interpretable machine learning. The proposed system may be well suited to bridge the gap between advanced computation and clinical applications.

Abbreviations

AI

artificial intelligence

ED

emergency department

EHR

electronic health record

ICU

intensive care unit

PACS

Patient Acuity Category Scale

ROC

receiver operating characteristic

SERT

System for Emergency Risk Triage

This study is supported by Duke-NUS Medical School, Singapore. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

NL and MEHO are Scientific Advisors of TIIM Healthcare Pte Ltd, a startup with solutions in medical triaging. All other authors have no conflicts of interest to declare.

Lowthian

Curtis

Cameron

Stoelwinder

Cooke

McNeil

Systematic review of trends in emergency department attendances: an Australian perspective

Emerg Med J 2011 05 28 5 373 7

10.1136/emj.2010.099226

20961936

emj.2010.099226

Derlet

Overcrowding in emergency departments: increased demand and decreased capacity

Ann Emerg Med 2002 04 39 4 430 2

10.1067/mem.2002.122707

11919530

S0196064402290770

Derlet

Richards

Overcrowding in the nation's emergency departments: complex causes and disturbing effects

Ann Emerg Med 2000 01 35 1 63 8

10.1016/s0196-0644(00)70105-3

10613941

S0196064400097729

Hoot

Nathan R

Aronsky

Dominik

Systematic review of emergency department crowding: causes, effects, and solutions

Ann Emerg Med 2008 08 52 2 126 36

10.1016/j.annemergmed.2008.03.014

18433933

S0196-0644(08)00606-9

PMC7340358

Di Somma

Paladino

Vaughan

Lalle

Magrini

Magnanti

Overcrowding in emergency department: an international issue

Intern Emerg Med 2015 03 10 2 171 5

10.1007/s11739-014-1154-8

25446540

Singer

Thode

Viccellio

Pines

The association between length of emergency department boarding and mortality

Acad Emerg Med 2011 12 18 12 1324 9

10.1111/j.1553-2712.2011.01236.x

22168198

Sabbatini

Kocher

Basu

Hsia

In-hospital outcomes and costs among patients hospitalized during a return visit to the emergency department

JAMA 2016 02 16 315 7 663 71

10.1001/jama.2016.0649

26881369

2491638

PMC8366576

Carter

Chochinov

A systematic review of the impact of nurse practitioners on cost, quality of care, satisfaction and wait times in the emergency department

CJEM 2007 07 21 9 4 286 95

10.1017/s1481803500015189

17626694

8fc58a6652bb485b9958aefe06b4296a

FitzGerald

Jelinek

Scott

Gerdtz

Emergency department triage revisited

Emerg Med J 2010 02 27 2 86 92

10.1136/emj.2009.077081

20156855

27/2/86

Wuerz

Milne

Eitel

Travers

Gilboy

Reliability and validity of a new five-level triage instrument

Acad Emerg Med 2000 03 7 3 236 42

10.1111/j.1553-2712.2000.tb01066.x

10730830

Considine

LeVasseur

Villanueva

The Australasian Triage Scale: examining emergency department nurses' performance using computer and paper scenarios

Ann Emerg Med 2004 11 44 5 516 23

10.1016/j.annemergmed.2004.04.007

15520712

S0196064404004159

Fong

Ru Ying

Glen

Wee Sern Sim

Mohamed Jamil

Ahmad Khairil

Tam

Wilson Wai San

Kowitlawakul

Yanika

Comparison of the Emergency Severity Index versus the Patient Acuity Category Scale in an emergency setting

Int Emerg Nurs 2018 11 41 13 18

10.1016/j.ienj.2018.05.001

29887281

S1755-599X(18)30064-8

Fenn

Davis

Buckland

Kapadia

Nichols

Gao

Knechtle

William

Balu

Suresh

Sendak

Mark

Theiling

B Jason

Development and validation of machine learning models to predict admission from emergency department to inpatient and intensive care units

Ann Emerg Med 2021 08 78 2 290 302

10.1016/j.annemergmed.2021.02.029

33972128

S0196-0644(21)00161-X

Parker

Liu

Shen

Lam

SSW

Ong

MEH

Predicting hospital admission at the emergency department triage: A novel prediction model

Am J Emerg Med 2019 08 37 8 1498 1504

10.1016/j.ajem.2018.10.060

30413365

S0735-6757(18)30891-X

Sun

Heng

Tay

Seow

Predicting hospital admissions at emergency department triage using routine administrative data

Acad Emerg Med 2011 08 18 8 844 50

10.1111/j.1553-2712.2011.01125.x

21843220

Xie

Liu

Ang

Low

AFW

Lam

SSW

Matchar

Ong

MEH

Chakraborty

Novel model for predicting inpatient mortality after emergency admission to hospital in Singapore: retrospective observational study

BMJ Open 2019 09 26 9 9 e031382

10.1136/bmjopen-2019-031382

31558458

bmjopen-2019-031382

PMC6773418

Jang

Kim

Lee

Hwang

Park

Lee

Dong Keon

Park

Inwon

Kim

Doyun

Chang

Hyunglan

Developing neural network models for early detection of cardiac arrest in emergency department

Am J Emerg Med 2020 01 38 1 43 49

10.1016/j.ajem.2019.04.006

30982559

S0735-6757(19)30226-8

Raita

Goto

Faridi

Brown

DFM

Camargo

Hasegawa

Emergency department triage prediction of clinical outcomes using machine learning models

Crit Care 2019 02 22 23 1 64

10.1186/s13054-019-2351-7

30795786

10.1186/s13054-019-2351-7

PMC6387562

Kirby

Dennis

Jayasinghe

Harris

Patient related factors in frequent readmissions: the influence of condition, access to services and patient choice

BMC Health Serv Res 2010 07 21 10 216

10.1186/1472-6963-10-216

20663141

1472-6963-10-216

PMC2918597

Sun

Hsia

Weiss

Zingmond

Liang

Han

McCreath

Asch

Effect of emergency department crowding on outcomes of admitted patients

Ann Emerg Med 2013 06 61 6 605 611.e6

10.1016/j.annemergmed.2012.10.026

23218508

S0196-0644(12)01699-X

PMC3690784

Ball

Murrells

Rafferty

Morrow

Griffiths

'Care left undone' during nursing shifts: associations with workload and perceived quality of care

BMJ Qual Saf 2014 02 23 2 116 25

10.1136/bmjqs-2012-001767

23898215

bmjqs-2012-001767

PMC3913111

Teubner

Considine

Hakendorf

Kim

Bersten

Model to predict inpatient mortality from information gathered at presentation to an emergency department: The Triage Information Mortality Model (TIMM)

Emerg Med Australas 2015 08 27 4 300 6

10.1111/1742-6723.12425

26147765

Abraham

Fonarow

Albert

Stough

Gheorghiade

Greenberg

O'Connor

Sun

Yancy

Young

OPTIMIZE-HF InvestigatorsCoordinators

Predictors of in-hospital mortality in patients hospitalized for heart failure: insights from the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF)

J Am Coll Cardiol 2008 07 29 52 5 347 56

10.1016/j.jacc.2008.04.028

18652942

S0735-1097(08)01672-0

Fernandes

Vieira

Leite

Palos

Finkelstein

Sousa

Clinical decision support systems for triage in the emergency department using intelligent systems: a review

Artif Intell Med 2020 01 102 101762

10.1016/j.artmed.2019.101762

31980099

S0933-3657(19)30126-5

Rudin

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

Nat Mach Intell 2019 5 13 1 5 206 215

10.1038/s42256-019-0048-x

Khairat

Marc

Crosby

Al Sanousi

Ali

Reasons for physicians not adopting clinical decision support systems: critical analysis

JMIR Med Inform 2018 04 18 6 2 e24

10.2196/medinform.8912

29669706

v6i2e24

PMC5932331

Ustun

Rudin

Supersparse linear integer models for optimized medical scoring systems

Mach Learn 2015 11 5 102 3 349 391

10.1007/s10994-015-5528-6

Ustun

Rudin

Learning Optimized Risk Scores

2017

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 2017

Halifax, NS, Canada

1125 1134

10.1145/3097983.3098161

Xie

Chakraborty

Ong

MEH

Goldstein

Liu

AutoScore: a machine learning-based automatic clinical score generator and its application to mortality prediction using electronic health records

JMIR Med Inform 2020 10 21 8 10 e21798

10.2196/21798

33084589

v8i10e21798

PMC7641783

Xie

Ong

MEH

Liew

JNMH

Tan

KBK

AFW

Nadarajan

Low

Kwan

Goldstein

Matchar

Chakraborty

Liu

Development and assessment of an interpretable machine learning triage tool for estimating mortality after emergency admissions

JAMA Netw Open 2021 08 02 4 8 e2118467

10.1001/jamanetworkopen.2021.18467

34448870

2783549

PMC8397930

Shen

Tay

Teo

EWK

Liu

Lam

Ong

MEH

Association between the elderly frequent attender to the emergency department and 30-day mortality: A retrospective study over 10 years

World J Emerg Med 2018 9 1 20 25

10.5847/wjem.j.1920-8642.2018.01.003

29290891

WJEM-9-20

PMC5717371

Fong

Glen

WSS

Mohamed Jamil

Tam

WWS

Kowitlawakul

Comparison of the Emergency Severity Index versus the Patient Acuity Category Scale in an emergency setting

Int Emerg Nurs 2018 11 41 13 18

10.1016/j.ienj.2018.05.001

29887281

S1755-599X(18)30064-8

Electronic Health Intelligence System

IHIS 2022-03-03

https://www.ihis.com.sg/Project_Showcase/Healthcare_Systems/Pages/eHINTS.aspx

Dickson

Dewar

Richardson

Hunter

Searle

Hodgson

Agreement and validity of electronic patient self-triage (eTriage) with nurse triage in two UK emergency departments: a retrospective study

Eur J Emerg Med 2022 02 01 29 1 49 55

10.1097/MEJ.0000000000000863

34545027

00063110-900000000-99010

Levin

Toerper

Hamrock

Hinson

Barnes

Gardner

Dugas

Linton

Kirsch

Kelen

Machine-learning-based electronic triage more accurately differentiates patients with respect to clinical outcomes compared with the Emergency Severity Index

Ann Emerg Med 2018 05 71 5 565 574.e2

10.1016/j.annemergmed.2017.08.005

28888332

S0196-0644(17)31442-7

Dugas

Kirsch

Toerper

Korley

Yenokyan

France

Hager

Levin

An electronic emergency triage system to improve patient distribution by critical outcomes

J Emerg Med 2016 06 50 6 910 8

10.1016/j.jemermed.2016.02.026

27133736

S0736-4679(16)00152-9

Cameron

Rodgers

Ireland

Jamdar

McKay

A simple tool to predict admission at the time of triage

Emerg Med J 2015 03 13 32 3 174 9

10.1136/emermed-2013-203200

24421344

emermed-2013-203200

PMC4345772

Kraaijvanger

Rijpsma

Roovers

van Leeuwen

Kaasjager

van den Brand

Horstink

Edwards

Development and validation of an admission prediction tool for emergency departments in the Netherlands

Emerg Med J 2018 08 35 8 464 470

10.1136/emermed-2017-206673

29627769

emermed-2017-206673

Mowbray

Zargoush

Jones

de Wit

Costa

Predicting hospital admission for older emergency department patients: insights from machine learning

Int J Med Inform 2020 08 140 104163

10.1016/j.ijmedinf.2020.104163

32474393

S1386-5056(19)31451-0

Jeffery

D'Onofrio

Gail

Paek

Platts-Mills

Soares

Hoppe

Genes

Nath

Melnick

Trends in emergency department visits and hospital admissions in health care systems in 5 states in the first months of the COVID-19 pandemic in the US

JAMA Intern Med 2020 10 01 180 10 1328 1333

10.1001/jamainternmed.2020.3288

32744612

2768777

PMC7400214

Liu

Chee

Niu

Pek

Siddiqui

Ansah

Matchar

David Bruce

Lam

Sean Shao Wei

Abdullah

Hairil Rizal

Chan

Angelique

Malhotra

Rahul

Graves

Nicholas

Koh

Mariko Siyue

Yoon

Sungwon

Andrew Fu Wah

Ting

Daniel Shu Wei

Low

Jenny Guek Hong

Ong

Marcus Eng Hock

Coronavirus disease 2019 (COVID-19): an evidence map of medical literature

BMC Med Res Methodol 2020 07 02 20 1 177

10.1186/s12874-020-01059-y

32615936

10.1186/s12874-020-01059-y

PMC7330264

Chee

Ong

MEH

Siddiqui

Zhang

Lim

AFW

Liu

Artificial intelligence applications for COVID-19 in intensive care and emergency settings: a systematic review

Int J Environ Res Public Health 2021 04 29 18 9 4749

10.3390/ijerph18094749

33947006

ijerph18094749

PMC8125462

Waring

Lindvall

Umeton

Automated machine learning: review of the state-of-the-art and opportunities for healthcare

Artif Intell Med 2020 04 104 101822

10.1016/j.artmed.2020.101822

32499001

S0933-3657(19)31043-7

Subbe

Kruger

Rutherford

Gemmel

Validation of a modified Early Warning Score in medical admissions

QJM 2001 10 94 10 521 6

10.1093/qjmed/94.10.521

11588210

Smith

Gary B

Redfern

Oliver C

Pimentel

Marco Af

Gerry

Stephen

Collins

Gary S

Malycha

James

Prytherch

David

Schmidt

Paul E

Watkinson

Peter J

The National Early Warning Score 2 (NEWS2)

Clin Med (Lond) 2019 05 19 3 260

10.7861/clinmedicine.19-3-260

31092526

19/3/260

PMC6542226

Rhee

Fisher

Willitis

The Rapid Acute Physiology Score

Am J Emerg Med 1987 07 5 4 278 82

10.1016/0735-6757(87)90350-0

3593492

0735-6757(87)90350-0

Olsson

Terent

Lind

Rapid Emergency Medicine score: a new prognostic tool for in-hospital mortality in nonsurgical emergency department patients

J Intern Med 2004 05 255 5 579 87

10.1111/j.1365-2796.2004.01321.x

15078500

JIM1321

Churpek

Yuen

Park

Meltzer

Hall

Edelson

Derivation of a cardiac arrest prediction model using ward vital signs*

Crit Care Med 2012 07 40 7 2102 8

10.1097/CCM.0b013e318250aa5a

22584764

PMC3378796

Breiman

Random Forests

Machine Learning 2001 45 1 5 32

10.1023/a:1010933404324

Oshiro

Perez

Baranauskas

How Many Trees in a Random Forest?

Machine Learning and Data Mining in Pattern Recognition 2012

Berlin, Heidelberg

Springer

Probst

Boulesteix

To tune or not to tune the number of trees in random forest

J Mach Learn Res 2017 18 1 6673 90

Tibshirani

Regression Shrinkage and Selection Via the Lasso

J R Stat Soc Series B Stat Methodol 2018 12 05 58 1 267 288

10.1111/j.2517-6161.1996.tb02080.x

LeCun

Bengio

Hinton

Deep learning

Nature 2015 05 28 521 7553 436 44

10.1038/nature14539

26017442

nature14539

Rajkomar

Oren

Chen

Dai

Hajaj

Hardt

Liu

Marcus

Sun

Sundberg

Yee

Zhang

Flores

Duggan

Irvine

Litsch

Mossin

Tansuwan

Wang

Wexler

Wilson

Ludwig

Volchenboum

Chou

Pearson

Madabushi

Shah

Butte

Howell

Cui

Corrado

Dean

Scalable and accurate deep learning with electronic health records

NPJ Digit Med 2018 1 18

10.1038/s41746-018-0029-1

31304302

PMC6550175

Xiao

Choi

Sun

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review

J Am Med Inform Assoc 2018 10 01 25 10 1419 1428

10.1093/jamia/ocy068

29893864

5035024

PMC6188527

Sherstinsky

Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Physica D: Nonlinear Phenomena 2020 03 404 132306

10.1016/j.physd.2019.132306

Murtagh

Multilayer perceptrons for classification and regression

Neurocomputing 1991 7 2 5-6 183 197

10.1016/0925-2312(91)90023-5