Performance of OpenAI’s GPT-4 in a mock MRCS Part A Examination
DOI: https://doi.org/10.62463/surgery.89
Keywords: AI, medical education, ChatGPT, MRCS
Abstract
Introduction:
OpenAI’s latest Large Language Model (LLM), GPT-4 (Generative Pre-trained Transformer 4), has demonstrated proficiency against various professional examination standards, including the USMLE (United States Medical Licensing Examination), the FRCS (Fellowship of the Royal Colleges of Surgeons) examination and the United States Bar examination. However, GPT-4’s capability on the MRCS (Membership of the Royal College of Surgeons) Part A has not yet been investigated.
Methodology:
A representative MRCS Part A examination, prepared and provided by “TeachMeSurgery” and based on the MRCS Intercollegiate Curriculum, was used to assess GPT-4’s performance. Each question was processed via the web-based interface of ChatGPT (Chat Generative Pre-trained Transformer) Plus.
Results:
GPT-4 scored 87.2% on Applied Basic Sciences (157/180 questions) and 86.7% on Principles of Surgery in General (104/120 questions), achieving an overall score of 261/300 (87%), above the typical passing threshold. GPT-4 scored 100% in four of the eleven predefined curriculum areas: Pharmacology, Microbiology, Data Interpretation and Audit, and The Surgical Care of Children. Its weakest performance was in the Medico-Legal Aspects of Surgical Practice, in which it scored 33.3%.
Conclusion:
GPT-4 successfully passed the mock MRCS Part A without any specialised preparatory training. Further research could investigate integrating GPT-4 to enhance a trainee surgeon’s learning and to serve as an effective tool for delivering high-quality, efficient patient care.
License
Copyright (c) 2024 Impact Surgery
This work is licensed under a Creative Commons Attribution 4.0 International License.