Using machine learning to parse breast pathology reports.

Adam Yala, Regina Barzilay, Laura Salama, Molly Griffin, Grace Sollender, Aditya Bardia, Constance Lehman, Julliette M Buckley, Suzanne B Coopey, Fernanda Polubriaginof, Judy E Garber, Barbara L Smith, Michele A Gadd, Michelle C Specht, Thomas M Gudewicz, Anthony J Guidi, Alphonse Taghian, Kevin S Hughes

Breast Cancer Research and Treatment 2017 January

PURPOSE: Extracting information from electronic medical records is a time-consuming and expensive process when done manually. Rule-based and machine learning techniques are two approaches to solving this problem. In this study, we trained a machine learning model on pathology reports to extract pertinent tumor characteristics, which enabled us to create a large database of attribute-searchable pathology reports. This database can be used to identify cohorts of patients with characteristics of interest.

METHODS: We collected a total of 91,505 breast pathology reports from three Partners hospitals: Massachusetts General Hospital, Brigham and Women's Hospital, and Newton-Wellesley Hospital, covering the period from 1978 to 2016. We trained our system with annotations from two datasets, consisting of 6,295 and 10,841 manually annotated reports. The system extracts 20 separate categories of information, including atypia types and various tumor characteristics such as receptors. We also report a learning curve analysis to show how much annotation our model needs to perform reasonably.
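
As a rough illustration of what a learning curve analysis involves, the sketch below trains on progressively larger slices of the annotated data and scores each model on a fixed held-out set. The abstract does not describe the model itself, so the keyword-voting classifier here is a toy stand-in and not the authors' method; the function names and the example reports are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (report text, label for a single category)


def train_keyword_model(examples: List[Example]) -> Dict[str, Counter]:
    """Toy stand-in model: count how often each token appears with each label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for text, label in examples:
        for token in set(text.lower().split()):
            counts[token][label] += 1
    return counts


def predict(model: Dict[str, Counter], text: str, default: str) -> str:
    """Sum the label votes of every known token; fall back to `default` if none match."""
    votes: Counter = Counter()
    for token in set(text.lower().split()):
        votes.update(model.get(token, {}))
    return votes.most_common(1)[0][0] if votes else default


def learning_curve(
    train: List[Example], test: List[Example], sizes: List[int]
) -> List[Tuple[int, float]]:
    """Return (training-set size, held-out accuracy) for each requested size."""
    points = []
    for n in sizes:
        subset = train[:n]
        model = train_keyword_model(subset)
        # Majority label of the subset is used when no token is recognized.
        default = Counter(label for _, label in subset).most_common(1)[0][0]
        correct = sum(predict(model, text, default) == label for text, label in test)
        points.append((n, correct / len(test)))
    return points


# Hypothetical miniature dataset, for illustration only.
train_reports = [
    ("invasive ductal carcinoma identified", "carcinoma"),
    ("benign breast tissue", "none"),
    ("ductal carcinoma in situ present", "carcinoma"),
    ("fibroadenoma, benign", "none"),
]
test_reports = [("invasive carcinoma", "carcinoma"), ("benign tissue", "none")]

curve = learning_curve(train_reports, test_reports, sizes=[1, 2, 4])
```

Plotting accuracy against training-set size shows where additional annotation stops paying off, which is the question a learning curve analysis is meant to answer.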

RESULTS: The model accuracy was tested on 500 reports that did not overlap with the training set. The model achieved accuracy of 90% for correctly parsing all carcinoma and atypia categories for a given patient. The average accuracy for individual categories was 97%. Using this classifier, we created a database of 91,505 parsed pathology reports.
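
The two figures above measure different things: the 97% is an average over individual categories, while the 90% requires every category for a patient to be correct at once, so it is necessarily the stricter number. A minimal sketch of the distinction follows, assuming each report is represented as a mapping from category name to extracted value; the category names and values are hypothetical and the evaluation code is not taken from the paper.

```python
from typing import Dict, List

Report = Dict[str, str]  # category name -> extracted value


def per_category_accuracy(gold: List[Report], pred: List[Report]) -> Dict[str, float]:
    """Fraction of reports on which each category was extracted correctly."""
    categories = gold[0].keys()
    return {
        c: sum(g[c] == p[c] for g, p in zip(gold, pred)) / len(gold)
        for c in categories
    }


def all_categories_accuracy(gold: List[Report], pred: List[Report]) -> float:
    """Fraction of reports on which every category was extracted correctly."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical annotations and predictions, for illustration only.
gold_reports = [
    {"carcinoma": "present", "atypia": "absent"},
    {"carcinoma": "absent", "atypia": "present"},
]
pred_reports = [
    {"carcinoma": "present", "atypia": "absent"},
    {"carcinoma": "absent", "atypia": "absent"},
]

by_category = per_category_accuracy(gold_reports, pred_reports)
mean_category = sum(by_category.values()) / len(by_category)
exact_match = all_categories_accuracy(gold_reports, pred_reports)
```

In this toy case a single wrong category lowers the average per-category accuracy only partially, but it counts the whole report as a miss under the all-categories measure.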

CONCLUSIONS: Our learning curve analysis shows that the model can achieve reasonable results even when trained on a few annotations. We developed a user-friendly interface to the database that allows physicians to easily identify patients with target characteristics and export the matching cohort. This model has the potential to reduce the effort required for analyzing large amounts of data from medical records, and to minimize the cost and time required to glean scientific insight from these data.


