
Queries related to COVID-19: a more effective retrieval through finetuned ALBERT with BM25L question answering system

Deepthi Godavarthi (Department of Computer Science and Systems Engineering, Andhra University College of Engineering (A), Visakhapatnam, India)
Mary Sowjanya A. (Department of Computer Science and Systems Engineering, Andhra University College of Engineering (A), Visakhapatnam, India)

World Journal of Engineering

ISSN: 1708-5284

Article publication date: 10 May 2021

Issue publication date: 22 February 2022


Abstract

Purpose

The purpose of this paper is to build a question answering (QA) system that improves the retrieval of answers to COVID-19 queries from the COVID-19 open research data set (CORD-19). Because CORD-19 is an up-to-date collection of coronavirus literature, text mining approaches can be used to retrieve answers to coronavirus-related questions. The existing ALBERT model (A Lite BERT for self-supervised learning of language representations) is finetuned to retrieve COVID-relevant information in response to scientific questions posed by the medical community and to highlight the context related to each COVID-19 query.

Design/methodology/approach

This study presents a finetuned ALBERT-based QA system that uses the Best Match 25 (Okapi BM25) ranking function and its variant BM25L for context retrieval; the system provided high scores on benchmark data sets such as SQuAD. The QA system was pre-trained on SQuAD and then finetuned on CORD-19 data so that it retrieves answers to COVID-19 questions by extracting information semantically relevant to each question.
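As a rough illustration of the two ranking functions named above, the following minimal sketch scores documents with Okapi BM25 and with BM25L. It is not the authors' implementation: the whitespace tokenizer, the class name `Bm25Index`, the parameter values (`k1=1.5`, `b=0.75`, `delta=0.5`) and the particular IDF forms are assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


class Bm25Index:
    """Hypothetical helper that scores documents with Okapi BM25 or BM25L."""

    def __init__(self, docs, k1=1.5, b=0.75, delta=0.5):
        self.docs = [tokenize(d) for d in docs]
        self.n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.n
        self.tfs = [Counter(d) for d in self.docs]
        # Document frequency: number of documents containing each term.
        self.df = Counter(t for d in self.docs for t in set(d))
        self.k1, self.b, self.delta = k1, b, delta

    def score(self, query, i, variant="bm25l"):
        # Length normalization factor shared by both variants.
        norm = 1 - self.b + self.b * len(self.docs[i]) / self.avgdl
        total = 0.0
        for term in tokenize(query):
            tf = self.tfs[i].get(term, 0)
            if tf == 0:
                continue
            df = self.df[term]
            if variant == "okapi":
                # Okapi BM25 with a smoothed, non-negative IDF (assumed form).
                idf = math.log(1 + (self.n - df + 0.5) / (df + 0.5))
                total += idf * tf * (self.k1 + 1) / (tf + self.k1 * norm)
            else:
                # BM25L: shift the length-normalized term frequency by delta
                # so that long documents are not over-penalized.
                idf = math.log((self.n + 1) / (df + 0.5))
                c = tf / norm
                total += idf * (self.k1 + 1) * (c + self.delta) / (self.k1 + c + self.delta)
        return total

    def rank(self, query, variant="bm25l"):
        # Document indices ordered from best to worst score.
        return sorted(range(self.n), key=lambda i: self.score(query, i, variant), reverse=True)
```

In a pipeline like the one described, the top-ranked passages from `rank` would be handed to the QA model as candidate contexts. The only difference between the two branches is the `delta` shift and the IDF form.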

Findings

BM25L was found to be more effective for retrieval than Okapi BM25. The finetuned ALBERT model, when extended to the CORD-19 data set, provided accurate results.

Originality/value

The finetuned ALBERT QA system was developed and tested for the first time on the CORD-19 data set to extract the context and highlight the answer span, making the answer clearer to the user.
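Highlighting an answer span in SQuAD-style extractive QA is commonly done by picking the start and end token positions with the highest combined score. The sketch below assumes the model has already produced one start score and one end score per context token. The function names, the `max_len` limit and the `**` marker are hypothetical and are not taken from the paper.

```python
def best_span(start_scores, end_scores, max_len=30):
    """Return (start, end) token indices maximizing start + end score."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end token must not precede the start token.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


def highlight(tokens, span, mark="**"):
    """Wrap the chosen span in a marker so it stands out in the context."""
    s, e = span
    marked = mark + " ".join(tokens[s:e + 1]) + mark
    return " ".join(tokens[:s] + [marked] + tokens[e + 1:])
```

In practice the scores would come from the finetuned model's output layer. Here they are left as plain lists so the selection logic can be read on its own.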

Citation

Godavarthi, D. and A., M.S. (2022), "Queries related to COVID-19: a more effective retrieval through finetuned ALBERT with BM25L question answering system", World Journal of Engineering, Vol. 19 No. 1, pp. 109-113. https://doi.org/10.1108/WJE-01-2021-0059

Publisher: Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
