DeBERTa-BiLSTM: A multi-label classification model of Arabic medical questions using pre-trained models and deep learning.

Bushra Salem Al-Smadi

Computers in Biology and Medicine 2024 January 5

It is wise to investigate past and present epidemics in the hope of learning from them and being better prepared for future ones. COVID-19 is one of the most recent and well-known pandemics; its effects are still felt today. Nearly all governments have announced various measures to combat the virus, making it challenging to keep people aware of the most up-to-date and relevant information. As a result, many websites have created and maintained Frequently Asked Questions (FAQs) regarding the pandemic. People naturally tend to ask about multiple points in one question, leading to multi-label questions. Multi-label question classification is one of the most common and complicated tasks in Natural Language Processing (NLP). One of classification's most significant contributions to advancing medical care and facilities is the development of automated question-and-answer systems. These systems can improve the efficiency of healthcare by reducing the burden on healthcare professionals and providing patients with timely and reliable answers to their questions. Due to the Arabic language's intricate morphology and structure, such a task becomes more challenging when dealing with Arabic text. This study aims to build a multi-label classification model for Arabic medical questions. Pre-trained neural models have significantly improved NLP performance and have recently been used in multi-label classification. This study proposes a deep learning model for classifying Arabic multi-label COVID-19 questions by combining the strengths of DeBERTa (Decoding-enhanced BERT with Disentangled Attention) and BiLSTM (Bidirectional Long Short-Term Memory) networks. Deep learning methods are prevalent because they generate dense feature representations automatically and implicitly capture hidden relationships. The DeBERTa model is fine-tuned to generate word vector representations, which are then fed to the BiLSTM model to extract deeper features. The proposed multi-label classification model assigns each question to one or more of ten available categories. The model is evaluated using Hamming loss, micro-precision, micro-recall, micro-F1, subset accuracy, AUC, and Jaccard index. It classified Arabic questions effectively, with encouraging performance: 0.042 for Hamming loss, 0.84 for micro-precision, micro-recall, and micro-F1, 0.71 for subset accuracy, 0.89 for AUC, and 0.72 for Jaccard index. This paves the way for adopting an automated multi-label classification model for medical questions in health facilities, which can help telehealth medical providers present more reliable and effective consultations.
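To make the reported metrics concrete, the following is a minimal sketch of how several of them can be computed for binary multi-label predictions. It is not the paper's evaluation code: the function name `multilabel_metrics`, the toy data with three labels, the sample-wise averaging of the Jaccard index, and the convention that two empty label sets score 1.0 are all assumptions made for illustration. AUC is left out because it needs predicted scores rather than hard labels.

```python
from typing import Dict, Sequence


def multilabel_metrics(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Dict[str, float]:
    """Compute common multi-label metrics from 0/1 label matrices.

    Rows are questions, columns are categories. Hypothetical helper,
    written for illustration only.
    """
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    tp = fp = fn = 0      # counts pooled over all labels (micro-averaging)
    exact_matches = 0     # rows where every label is predicted correctly
    jaccard_sum = 0.0     # running sum of per-sample Jaccard scores

    for t_row, p_row in zip(y_true, y_pred):
        pairs = list(zip(t_row, p_row))
        row_tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        row_fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        row_fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        tp, fp, fn = tp + row_tp, fp + row_fp, fn + row_fn
        exact_matches += int(row_fp + row_fn == 0)
        union = row_tp + row_fp + row_fn
        # Assumption: two empty label sets count as a perfect match.
        jaccard_sum += row_tp / union if union else 1.0

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        # Fraction of individual label decisions that are wrong.
        "hamming_loss": (fp + fn) / (n_samples * n_labels),
        "micro_precision": precision,
        "micro_recall": recall,
        "micro_f1": f1,
        # Fraction of questions whose full label set is exactly right.
        "subset_accuracy": exact_matches / n_samples,
        "jaccard_index": jaccard_sum / n_samples,
    }


# Toy example: 4 questions, 3 categories (the paper uses 10 categories).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 1], [0, 0, 1]]
scores = multilabel_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Subset accuracy is the strictest of these measures, since a single wrong label makes the whole question count as an error, while Hamming loss scores each label decision separately. That difference is consistent with a model reporting a low Hamming loss alongside a noticeably lower subset accuracy.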
