Semantic metadata annotation: tagging Medline abstracts for enhanced information access

Fidelia Ibekwe‐SanJuan (ELICO – University of Lyon, Lyon, France)

Aslib Proceedings

ISSN: 0001-253X

Article publication date: 8 July 2010

Abstract

Purpose

The objective of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, sentences were classified into four major argumentative roles: objective, method, result, and conclusion. If the role of each sentence can be marked up, these metadata can then be used during information retrieval to target particular types of information, such as the novelty, conclusions, methodology, or aims of a piece of scientific work.

Design/methodology/approach

Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico‐syntactic patterns modelled as regular expressions implemented in a linguistic parser. Positional heuristics make use of the relative position of a sentence in the abstract to deduce its argumentative class.
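As an illustration of how the two approaches differ, the following is a minimal sketch in Python. It is not the study's implementation: the cue patterns, the function names `classify_by_cues` and `classify_by_position`, and the position thresholds are all hypothetical, chosen only to show the general shape of each technique.

```python
import re

# Hypothetical cue patterns (assumptions for illustration only; the
# study's actual lexico-syntactic patterns are not reproduced here).
CUE_PATTERNS = {
    "objective": re.compile(
        r"\b(the )?(aim|objective|purpose|goal) of (this|the) (study|paper|work)\b"
        r"|\bwe (aimed|sought) to\b",
        re.IGNORECASE,
    ),
    "method": re.compile(
        r"\b(was|were) (measured|randomi[sz]ed|assessed|analy[sz]ed|recruited)\b"
        r"|\bwe (used|performed|conducted)\b",
        re.IGNORECASE,
    ),
    "result": re.compile(
        r"\bresults? (showed|indicated?|revealed)\b"
        r"|\bwe (found|observed)\b"
        r"|\bsignificantly (higher|lower)\b",
        re.IGNORECASE,
    ),
    "conclusion": re.compile(
        r"\b(in conclusion|we conclude|these (results|findings|data) suggest)\b",
        re.IGNORECASE,
    ),
}


def classify_by_cues(sentence):
    """Return the first role whose cue pattern matches, or None if no cue fires."""
    for role, pattern in CUE_PATTERNS.items():
        if pattern.search(sentence):
            return role
    return None


def classify_by_position(index, total):
    """Guess a role from the sentence's relative position in the abstract.

    `index` is the zero-based sentence position and `total` the number of
    sentences. The cut-off values below are assumed, not taken from the study.
    """
    if total <= 0 or not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    relative = index / total
    if relative < 0.25:
        return "objective"
    if relative < 0.50:
        return "method"
    if relative < 0.85:
        return "result"
    return "conclusion"
```

In this sketch a sentence such as "Blood pressure fell by 5 mmHg." matches no cue pattern and is left unlabelled by `classify_by_cues`, whereas `classify_by_position` always returns a role because it relies only on where the sentence sits in the abstract.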

Findings

The experiments showed that positional heuristics performed far better on Medline abstracts, attaining an F‐score of 64 per cent, whereas the linguistic cues attained an F‐score of only 12 per cent. This is mostly because the argumentative role of a sentence is not always announced by surface linguistic cues.
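For readers unfamiliar with the measure, the sketch below shows one common way to compute an F‐score for a single argumentative role from gold and predicted labels. It assumes the balanced F1 variant (the harmonic mean of precision and recall); the abstract does not state which variant or averaging scheme the study used, and the function name `f_score` is hypothetical.

```python
def f_score(gold, predicted, role):
    """Balanced F1 for one role, given parallel lists of gold and predicted labels.

    A predicted label of None stands for a sentence the classifier left
    unlabelled. This is an assumed evaluation setup, not the study's own.
    """
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == role and p == role)
    false_pos = sum(1 for g, p in pairs if g != role and p == role)
    false_neg = sum(1 for g, p in pairs if g == role and p != role)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A classifier that leaves many sentences unlabelled loses recall for every role it misses, which is consistent with the explanation given above for the low score of the linguistic cues.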

Research limitations/implications

A limitation of the study was that other methods for this task could not be tested, such as machine learning techniques, which have been reported to perform better on Medline abstracts. Also, to compare the results with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped onto four major argumentative roles. This may have biased the performance of sentence classification by positional heuristics in its favour.
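The mapping step can be pictured as a simple lookup table. The sketch below is an assumption made for illustration: the section labels are typical of structured Medline abstracts, but the specific assignments (for example, folding "BACKGROUND" into the objective role) and the names `ROLE_MAP` and `map_label` are hypothetical and are not taken from the study.

```python
# Hypothetical mapping from structured-abstract headings to the four
# major roles. The assignments are assumed, not the study's actual mapping.
ROLE_MAP = {
    "BACKGROUND": "objective",
    "AIM": "objective",
    "AIMS": "objective",
    "OBJECTIVE": "objective",
    "OBJECTIVES": "objective",
    "PURPOSE": "objective",
    "METHOD": "method",
    "METHODS": "method",
    "DESIGN": "method",
    "RESULT": "result",
    "RESULTS": "result",
    "FINDINGS": "result",
    "CONCLUSION": "conclusion",
    "CONCLUSIONS": "conclusion",
}


def map_label(label):
    """Normalise a heading and return its major role, or None if unmapped."""
    key = label.strip().rstrip(":").upper()
    return ROLE_MAP.get(key)
```

Collapsing many headings into four roles makes the positions of those roles more regular across abstracts, which is one way such a mapping could favour a position-based classifier.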

Originality/value

To the best of the author's knowledge, this study is the first to evaluate linguistic cues and positional heuristics on the same corpus.

Citation

Ibekwe‐SanJuan, F. (2010), "Semantic metadata annotation: tagging Medline abstracts for enhanced information access", Aslib Proceedings, Vol. 62 No. 4/5, pp. 476-488. https://doi.org/10.1108/00012531011074717

Publisher

Emerald Group Publishing Limited

Copyright © 2010, Emerald Group Publishing Limited
