Performance of a large language model on Japanese emergency medicine board certification examinations.

Yutaka Igarashi, Kyoichi Nakahara, Tatsuya Norii, Nodoka Miyake, Takashi Tagami, Shoji Yokobori

Journal of Nippon Medical School 2024 March 3

Background: Emergency physicians need a broad range of knowledge and skills to address critical medical, traumatic, and environmental conditions. Artificial intelligence (AI), including large language models (LLMs), has potential applications in healthcare settings; however, the performance of LLMs in emergency medicine remains unclear.

Methods: To evaluate the reliability of information provided by ChatGPT, an LLM was given the questions set by the Japanese Association of Acute Medicine in its board certification examinations over a period of 5 years (2018-2022) and programmed to answer them twice. Statistical analysis was used to assess agreement of the two responses.

Results: The LLM successfully answered 465 of the 475 text-based questions, achieving an overall correct response rate of 62.3%. For questions without images, the rate of correct answers was 65.9%. For questions with images that were not explained to the LLM, the rate of correct answers was only 52.0%. The annual rates of correct answers to questions without images ranged from 56.3% to 78.8%. Accuracy was better for scenario-based questions (69.1%) than for stand-alone questions (62.1%). Agreement between the two responses was substantial (kappa = 0.70). Factual error accounted for 82% of the incorrectly answered questions.

Conclusion: An LLM performed satisfactorily on an emergency medicine board certification examination in Japanese and without images. However, factual errors in the responses highlight the need for physician oversight when using LLMs.
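The agreement statistic reported in the Results (kappa = 0.70) can be illustrated with a minimal sketch of Cohen's kappa for two answer runs. This is not the authors' analysis code: the abstract does not say which software was used or exactly what was compared, so the function name `cohen_kappa` and the two answer lists below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(first) != len(second) or not first:
        raise ValueError("both runs must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: share of items where both runs gave the same label.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    counts_first = Counter(first)
    counts_second = Counter(second)
    labels = counts_first.keys() | counts_second.keys()
    expected = sum(counts_first[k] * counts_second[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both runs used a single identical label throughout; kappa is undefined,
        # and returning 1.0 here is a convention adopted for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: answer options chosen on ten questions in two runs.
first_run = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]
second_run = ["a", "a", "a", "a", "b", "a", "b", "b", "b", "b"]

kappa = cohen_kappa(first_run, second_run)
print(round(kappa, 2))  # 0.6 for this toy data (8/10 observed, 0.5 expected by chance)
```

On the commonly used Landis and Koch scale, values from 0.61 to 0.80 are described as substantial agreement, which matches the wording used for the reported kappa of 0.70.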
