Safety and accuracy of AI in triaging patients in the emergency department
Dublin Core
Title
Safety and accuracy of AI in triaging patients in the emergency department
Subject
Emergency department, AI, Triage, ChatGPT
Description
Abstract
Background Artificial Intelligence (AI) has been increasingly explored in healthcare, particularly in emergency
department (ED) triage. This study aimed to evaluate the effectiveness of the AI chatbot ChatGPT in triaging patients,
focusing on its accuracy, safety, efficiency, and impact on patient care.
Methods A prospective observational study was conducted at the ED of King Saud Medical City (KSMC) in Riyadh,
Saudi Arabia, with a sample size of 138 patients. Patients requiring immediate resuscitation were excluded. ED
physicians assigned triage scores using the Canadian Triage and Acuity Scale (CTAS), followed by AI-generated scores
for the same patients. In cases of discrepancy, the final decision by the senior ED consultant was considered the gold
standard. The study assessed inter-rater reliability between AI and human raters and evaluated the accuracy of each
compared to the consultant’s assessment.
Results The results indicated a high agreement rate (85.61%) between ChatGPT and ED physicians, with substantial
inter-rater reliability (κ=0.780, 95% Confidence Interval [CI] 0.676–0.884, p<0.001). Agreement between ED physicians
and consultants was 63.9%, with moderate reliability (κ=0.406, 95% CI 0.006–0.806, p=0.018). Consultants assigned
lower acuity levels than physicians in most cases. ChatGPT’s accuracy compared to the consultant was 42.86%, with
slight reliability, showing a tendency to overestimate acuity, particularly in critical cases. However, it performed better
in mid-range acuity levels.
Conclusion The findings suggested that AI could support ED triage by aligning closely with human decision-making.
However, its overestimation of severity could lead to over-triaging and increased resource use. Limitations included a
small sample size and the use of a general AI model not specifically trained for medical triage. Future research should
focus on AI models tailored for ED triage to improve reliability and clinical applicability.
Keywords Emergency department, AI, Triage, ChatGPT
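The agreement rate and κ values reported in the Results can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa; the abstract does not say whether a weighted variant was used. The function names and the CTAS scores below are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of patients given the same triage level by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over levels of the product of marginal proportions.
    levels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in levels) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical level for everyone; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical CTAS levels (1 = most urgent, 5 = least urgent), illustration only.
physician_scores = [2, 3, 3, 4, 5, 3, 2, 4]
chatbot_scores = [2, 3, 3, 4, 4, 3, 2, 3]

agreement = percent_agreement(physician_scores, chatbot_scores)
kappa = cohen_kappa(physician_scores, chatbot_scores)
print(f"agreement = {agreement:.2%}, kappa = {kappa:.3f}")
```

Because κ discounts the agreement expected by chance, it is always lower than the raw agreement rate unless agreement is perfect, which is why the abstract reports both figures side by side.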
Creator
Lama Mohammad Alomari, Mai Mamdouh Alshammari, Asal Osama Arbaeen, Raghad Abdullah Alshehri, and Hanin Saad Almalki
Source
https://doi.org/10.1186/s12245-025-01069-x
Date
2025
Contributor
Peri Irawan
Format
PDF
Language
ENGLISH
Type
TEXT
Files
Collection
Citation
Lama Mohammad Alomari, Mai Mamdouh Alshammari, Asal Osama Arbaeen, Raghad Abdullah Alshehri, and Hanin Saad Almalki, “Safety and accuracy of AI in triaging patients in the emergency department,” Repository Horizon University Indonesia, accessed April 13, 2026, https://repository.horizon.ac.id/items/show/12909.