Ethan Goh, Robert Gallo, Jason Hom, Eric Strong, Yingjie Weng, Hannah Kerman, Josephine Cool, Zahir Kanjee, Andrew S Parsons, Neera Ahuja, Eric Horvitz, Daniel Yang, Arnold Milstein, Andrew P J Olson, Adam Rodman, Jonathan H Chen
IMPORTANCE: Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning.
OBJECTIVE: To assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources.
DESIGN: Multi-center, randomized clinical vignette study...
March 14, 2024: medRxiv