Automated Extraction of Diagnostic Criteria From Electronic Health Records for Autism Spectrum Disorders: Development, Evaluation, and Application.

Gondy Leroy, Yang Gu, Sydney Pettygrove, Maureen K Galindo, Ananyaa Arora, Margaret Kurzius-Spencer

Journal of Medical Internet Research 2018 November 8

BACKGROUND: Electronic health records (EHRs) bring many opportunities for information utilization. One such use is the surveillance conducted by the Centers for Disease Control and Prevention to track cases of autism spectrum disorder (ASD). This process currently comprises manual collection and review of EHRs of 4- and 8-year-old children in 11 US states for the presence of ASD criteria. The work is time-consuming and expensive.

OBJECTIVE: Our objective was to automatically extract from EHRs the description of behaviors noted by the clinicians in evidence of the diagnostic criteria in the Diagnostic and Statistical Manual of Mental Disorders (DSM). Previously, we reported on the classification of entire EHRs as ASD or not. In this work, we focus on the extraction of individual expressions of the different ASD criteria in the text. We intend to facilitate large-scale surveillance efforts for ASD and support analysis of changes over time as well as enable integration with other relevant data.

METHODS: We developed a natural language processing (NLP) parser to extract expressions of 12 DSM criteria using 104 patterns and 92 lexicons (1787 terms). The parser is rule-based to enable precise extraction of the entities from the text. The entities themselves are encompassed in the EHRs as very diverse expressions of the diagnostic criteria written by different people at different times (clinicians, speech pathologists, among others). Due to the sparsity of the data, a rule-based approach is best suited until larger datasets can be generated for machine learning algorithms.
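As a rough illustration of how patterns and lexicons can combine in a rule-based extractor, here is a minimal Python sketch. It is not the authors' parser: the lexicon names, the terms, the two patterns, the criterion labels attached to them, and the `max_gap` word window are all hypothetical choices made for this example.

```python
import re

# Hypothetical lexicons (illustrative terms only, not taken from the paper).
LEXICONS = {
    "NEGATION": ["no", "poor", "limited", "avoids", "lack of"],
    "EYE_CONTACT": ["eye contact", "eye gaze"],
    "REPETITIVE_MOTION": ["hand flapping", "flaps his hands",
                          "flaps her hands", "rocking"],
}

# Hypothetical patterns: a criterion label plus an ordered list of lexicon slots.
PATTERNS = [
    ("A1a", ["NEGATION", "EYE_CONTACT"]),
    ("A3c", ["REPETITIVE_MOTION"]),
]


def compile_pattern(slots, max_gap=3):
    """Build a regex that matches the slots in order, allowing up to
    max_gap intervening words between consecutive slots."""
    parts = []
    for slot in slots:
        # Longest terms first so multiword terms win over their prefixes.
        terms = sorted(LEXICONS[slot], key=len, reverse=True)
        parts.append("(?:" + "|".join(re.escape(t) for t in terms) + ")")
    gap = r"(?:\W+\w+){0," + str(max_gap) + r"}?\W+"
    return re.compile(r"\b" + gap.join(parts) + r"\b", re.IGNORECASE)


COMPILED = [(label, compile_pattern(slots)) for label, slots in PATTERNS]


def extract(sentence):
    """Return (criterion label, matched text) for each pattern that fires."""
    hits = []
    for label, regex in COMPILED:
        match = regex.search(sentence)
        if match:
            hits.append((label, match.group(0)))
    return hits


print(extract("He makes poor eye contact with peers."))
print(extract("He has no trouble sleeping."))
```

Because each rule only fires on terms from curated lexicons, a design like this tends to favor precision over recall: expressions phrased in ways the lexicons do not cover are simply missed.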

RESULTS: We evaluated our rule-based parser and compared it with a machine learning baseline (decision tree). Using a test set of 6636 sentences (50 EHRs), we found that our parser achieved 76% precision, 43% recall (ie, sensitivity), and >99% specificity for criterion extraction. The performance was better for the rule-based approach than for the machine learning baseline (60% precision and 30% recall). For some individual criteria, precision was as high as 97% and recall 57%. Since precision was very high, we were assured that criteria were rarely assigned incorrectly, and our numbers presented a lower bound of their presence in EHRs. We then conducted a case study and parsed 4480 new EHRs covering 10 years of surveillance records from the Arizona Developmental Disabilities Surveillance Program. The social criteria (A1 criteria) showed the biggest change over the years. The communication criteria (A2 criteria) did not distinguish the ASD from the non-ASD records. Among behaviors and interests criteria (A3 criteria), 1 (A3b) was present with much greater frequency in the ASD than in the non-ASD EHRs.
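For readers less familiar with the three reported metrics, the sketch below computes them from sentence-level confusion counts. The function name and the counts in the usage example are hypothetical toy values, not figures from the study.

```python
def extraction_metrics(tp, fp, fn, tn):
    """Compute precision, recall (sensitivity), and specificity.

    tp: sentences correctly labeled with a criterion
    fp: sentences labeled with a criterion they do not express
    fn: sentences expressing a criterion that the parser missed
    tn: sentences correctly left unlabeled
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {"precision": precision, "recall": recall, "specificity": specificity}


# Toy counts for illustration only: 100 sentences, 5 of which express a criterion.
metrics = extraction_metrics(tp=3, fp=1, fn=2, tn=94)
print(metrics)
```

In this toy example most sentences carry no criterion, so specificity is dominated by true negatives and stays close to 1 even though recall is modest.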

CONCLUSIONS: Our results demonstrate that NLP can support large-scale analysis useful for ASD surveillance and research. In the future, we intend to facilitate detailed analysis and integration of national datasets.
