Generative AI’s performance on emergency medicine boards questions: an observational study

Date
2025
Authors
Kajitani, Sten
Pastrak, Mila
Goodings, Anthony
Nguyen, Audrey
Drewek, Austin
Lafree, Andrew
Murphy, Adrian
Publisher
UCC Medical Research and Technology Society
Abstract
Background: The evolving field of medicine has introduced ChatGPT as a potential assistive platform, though its use in medical board exam preparation remains debated [1-2]. This study aimed to evaluate the performance of a custom-modified version of ChatGPT-4, tailored with emergency medicine board exam preparatory materials (an Anki deck), against its default version and the previous iteration (ChatGPT-3.5) [3]. The goal was to assess the accuracy of ChatGPT-4 in answering board-style questions and its suitability as a tool for medical education.

Methods: A comparative analysis was conducted using a random selection of 598 questions from the Rosh In-Training Exam Question Bank [4]. The study compared three versions of ChatGPT: the default ChatGPT-4, the custom ChatGPT-4, and ChatGPT-3.5. Accuracy, response length, medical discipline subgroups, and underlying causes of error were analyzed.

Results: Custom ChatGPT-4 did not significantly improve accuracy over the default version (p>0.05), but both significantly outperformed ChatGPT-3.5 (p<
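The pairwise accuracy comparisons reported above are typically run on a 2x2 contingency table of correct/incorrect counts per model. A minimal sketch in Python follows, assuming a chi-square test of independence; the abstract does not state which test the authors used (a paired test such as McNemar's would also be defensible, since all models answered the same questions), and the correct-answer counts below are hypothetical placeholders, not the study's results.

# Minimal sketch (hypothetical counts): comparing two models' accuracy on the
# same 598-question bank with a chi-square test of independence.
from scipy.stats import chi2_contingency

N = 598  # questions sampled from the Rosh In-Training Exam Question Bank

# Rows: model; columns: [correct, incorrect]. Counts are illustrative only.
table = [
    [500, N - 500],  # hypothetical default ChatGPT-4 correct count
    [420, N - 420],  # hypothetical ChatGPT-3.5 correct count
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")

A small p-value here would indicate that the two models' accuracies differ beyond what chance alone would explain, which is the form of the significance claims made in the Results.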
Keywords
Generative AI, Emergency medicine boards questions
Citation
Kajitani, S., Pastrak, M., Goodings, A., Nguyen, A., Drewek, A., Lafree, A. and Murphy, A. (2025) 'Generative AI’s performance on emergency medicine boards questions: an observational study', UCC Student Medical Journal, 5, p. 113. https://doi.org/10.33178/SMJ.2025.1.39