Queries related to COVID-19: a more effective retrieval through finetuned ALBERT with BM25L question answering system
ISSN: 1708-5284
Article publication date: 10 May 2021
Issue publication date: 22 February 2022
Abstract
Purpose
The purpose of this paper is to build a question answering (QA) system that improves the retrieval of answers to COVID-19 queries from the COVID-19 open research data set (CORD-19). As CORD-19 holds an up-to-date collection of coronavirus literature, text mining approaches can be used to retrieve answers to coronavirus-related questions. The existing A Lite BERT for self-supervised learning of language representations (ALBERT) model is finetuned to retrieve COVID-19-relevant information in response to scientific questions posed by the medical community and to highlight the context related to the COVID-19 query.
Design/methodology/approach
This study presents a finetuned ALBERT-based QA system, used in association with the Best Match 25 (Okapi BM25) ranking function and its variant BM25L for context retrieval; ALBERT has provided high scores on benchmark data sets such as SQuAD. The QA system is pre-trained on SQuAD and finetuned on CORD-19 data to retrieve answers to COVID-19 questions by extracting information that is semantically relevant to each question.
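The two ranking functions named above differ only in how they normalize term frequency, which a short sketch can make concrete. The Python below is an illustrative sketch, not the authors' implementation: the class name `Bm25Index`, the whitespace tokenizer, the shared smoothed IDF and the parameter values (`k1=1.5`, `b=0.75`, `delta=0.5`) are all assumptions, as the abstract does not state them. BM25L adds a constant `delta` to the length-normalized term frequency, so that long documents are penalized less heavily than under Okapi BM25.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for illustration).
    return text.lower().split()


class Bm25Index:
    """Hypothetical in-memory index scoring documents with BM25 or BM25L."""

    def __init__(self, docs, k1=1.5, b=0.75, delta=0.5):
        # k1, b and delta are commonly used defaults, not values from the paper.
        self.docs = [tokenize(d) for d in docs]
        self.k1, self.b, self.delta = k1, b, delta
        self.n_docs = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.n_docs
        self.tfs = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        self.df = Counter(t for d in self.docs for t in set(d))

    def idf(self, term):
        # Smoothed, always-positive IDF; shared by both variants here so that
        # only the term-frequency component differs (a simplification).
        return math.log((self.n_docs + 1) / (self.df[term] + 0.5))

    def score(self, query, i, variant="bm25l"):
        # Length normalization factor for document i.
        norm = 1 - self.b + self.b * len(self.docs[i]) / self.avgdl
        total = 0.0
        for term in tokenize(query):
            tf = self.tfs[i][term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            if variant == "bm25":
                part = tf * (self.k1 + 1) / (tf + self.k1 * norm)
            else:
                # BM25L: shift the normalized term frequency by delta.
                c = tf / norm + self.delta
                part = (self.k1 + 1) * c / (self.k1 + c)
            total += self.idf(term) * part
        return total

    def rank(self, query, variant="bm25l"):
        # Return document indices ordered from most to least relevant.
        scores = [self.score(query, i, variant) for i in range(self.n_docs)]
        return sorted(range(self.n_docs), key=lambda i: scores[i], reverse=True)
```

In a retrieve-then-read pipeline of the kind described, the top-ranked passages from `rank` would be passed as context to the finetuned QA model. With `delta=0` the BM25L branch reduces to the Okapi BM25 formula.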
Findings
BM25L is found to be more effective in retrieval than Okapi BM25. The finetuned ALBERT model, when extended to the CORD-19 data set, provided accurate results.
Originality/value
The finetuned ALBERT QA system was developed and tested for the first time on the CORD-19 data set to extract context and highlight the answer span, giving the user greater clarity.
Keywords
Citation
Godavarthi, D. and A., M.S. (2022), "Queries related to COVID-19: a more effective retrieval through finetuned ALBERT with BM25L question answering system", World Journal of Engineering, Vol. 19 No. 1, pp. 109-113. https://doi.org/10.1108/WJE-01-2021-0059
Publisher
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited