Self-Supervised Contextual Language Representation of Radiology Reports to Improve the Identification of Communication Urgency

Link:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7233055/

Abstract:

Machine learning methods have recently achieved high performance in biomedical text analysis. However, a major bottleneck in the widespread application of these methods is obtaining the required large amounts of annotated training data, which is resource-intensive and time-consuming. Recent progress in self-supervised learning has shown promise in leveraging large text corpora without explicit annotations. In this work, we built a self-supervised contextual language representation model using BERT, a deep bidirectional transformer architecture, to identify radiology reports requiring prompt communication to the referring physicians. We pre-trained the BERT model on a large unlabeled corpus of radiology reports and used the resulting contextual representations in a final text classifier for communication urgency. Our model achieved a precision of 97.0%, a recall of 93.3%, and an F-measure of 95.1% on an independent test set in identifying radiology reports for prompt communication, and significantly outperformed the previous state-of-the-art model based on word2vec representations.
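The F-measure reported above is the harmonic mean of precision and recall (the standard F1 score). As a minimal sketch, not part of the paper's code, the reported numbers can be checked for consistency:

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall values reported in the abstract.
f1 = f_measure(0.970, 0.933)
print(round(f1 * 100, 1))  # → 95.1, matching the reported F-measure
```

The harmonic mean penalizes imbalance between precision and recall, so a high F1 here indicates the classifier performs well on both fronts.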

Citation:

Xing Meng, Craig H. Ganoe, Ryan T. Sieberg, Yvonne Y. Cheung, Saeed Hassanpour, “Self-Supervised Contextual Language Representation of Radiology Reports to Improve the Identification of Communication Urgency”, American Medical Informatics Association (AMIA) Summits on Translational Science Proceedings, 2020:413-421.
