GPT-4 Artificial Intelligence Model Outperforms ChatGPT, Medical Students, and Neurosurgery Residents on Neurosurgery Written Board-Like Questions.
World Neurosurgery 2023 November
BACKGROUND: Artificial intelligence (AI) and machine learning have transformed health care with applications in various specialized fields. Neurosurgery can benefit from artificial intelligence in surgical planning, predicting patient outcomes, and analyzing neuroimaging data. GPT-4, an updated language model with additional training parameters, has exhibited exceptional performance on standardized exams. This study examines GPT-4's competence on neurosurgical board-style questions, comparing its performance with medical students and residents, to explore its potential in medical education and clinical decision-making.
METHODS: GPT-4's performance was examined on 643 Congress of Neurological Surgeons Self-Assessment Neurosurgery Exam (SANS) board-style questions from various neurosurgery subspecialties. Of these, 477 were text-based and 166 contained images. GPT-4 refused to answer 52 questions that contained no text. The remaining 591 questions were entered into GPT-4, and its performance was evaluated based on first-time responses. Raw scores were analyzed across subspecialties and question types, and then compared with previous findings on the performance of ChatGPT, SANS users, medical students, and neurosurgery residents.
RESULTS: GPT-4 attempted 91.9% of Congress of Neurological Surgeons SANS questions and achieved 76.6% accuracy. The model's accuracy increased to 79.0% for text-only questions. GPT-4 outperformed ChatGPT (P < 0.001) and scored highest in the pain/peripheral nerve category (84%) and lowest in the spine category (73%). It exceeded the performance of medical students (26.3%), neurosurgery residents (61.5%), and the national average of SANS users (69.3%) across all categories.
CONCLUSIONS: GPT-4 significantly outperformed medical students, neurosurgery residents, and the national average of SANS users. The model's accuracy suggests potential applications in educational settings and clinical decision-making, enhancing provider efficiency, and improving patient care.
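The question counts in METHODS and the attempt rate in RESULTS can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration: the variable names are hypothetical, and the implied number of correct answers assumes that the 76.6% accuracy was computed over the attempted questions, which the abstract does not state explicitly.

```python
# Counts reported in the abstract's METHODS section.
total_questions = 643
text_based = 477
image_based = 166
declined = 52  # questions GPT-4 refused to answer

# The text/image split should account for every question.
assert text_based + image_based == total_questions

# Questions actually attempted, and the attempt rate as a percentage.
attempted = total_questions - declined
attempt_rate = 100 * attempted / total_questions

# ASSUMPTION: accuracy is measured over attempted questions only.
# The resulting count is an inference, not a figure from the abstract.
reported_accuracy = 76.6
implied_correct = round(reported_accuracy / 100 * attempted)

print(f"attempted: {attempted}")
print(f"attempt rate: {attempt_rate:.1f}%")
print(f"implied correct (approx.): {implied_correct}")
```

Under these assumptions, the computed attempt rate agrees with the 91.9% reported in RESULTS, and the 591 attempted questions match the count given in METHODS.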