Negation detection and word sense disambiguation in digital archaeology reports for the purposes of semantic annotation

Andreas Vlachidis (Hypermedia Research Unit, University of South Wales, Pontypridd, Wales.)
Douglas Tudhope (Hypermedia Research Unit, University of South Wales, Pontypridd, Wales.)

Program: electronic library and information systems

ISSN: 0033-0337

Article publication date: 7 April 2015

Abstract

Purpose

The purpose of this paper is to present the role and contribution of natural language processing techniques, in particular negation detection and word sense disambiguation, in the process of semantic annotation of archaeological grey literature. Archaeological reports contain a great deal of information that conveys facts and findings in different ways. This kind of information is highly relevant to the research and analysis of archaeological evidence but at the same time can be a hindrance to the accurate indexing of documents with respect to positive assertions.

Design/methodology/approach

The paper presents a method for adapting the biomedicine-oriented negation algorithm NegEx to the context of archaeology and discusses the evaluation results of the modified negation detection module. A particular form of polysemy, which is introduced by the definition of ontology classes and concerns the semantics of small finds in archaeology, is addressed by a domain-specific word-sense disambiguation module.
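As a rough illustration of the NegEx-style idea only, the following minimal Python sketch flags terms that fall within a fixed token window after a negation trigger phrase. The trigger list, the window size, the example vocabulary and the function names are all assumptions made for this sketch; they are not the rules or glossaries of the module described in the paper.

```python
import re

# Hypothetical trigger phrases (assumption); the paper's adapted,
# archaeology-specific trigger lists are not reproduced here.
PRE_NEGATION = ["no evidence of", "absence of", "no", "not", "without"]

# Assumed maximum number of tokens after a trigger that fall in its scope.
WINDOW = 5


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def negated_terms(sentence, terms):
    """Return the subset of `terms` found within WINDOW tokens after a trigger."""
    tokens = tokenize(sentence)
    negated = set()
    for trigger in PRE_NEGATION:
        trig = trigger.split()
        n = len(trig)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == trig:
                # Tokens immediately following the trigger phrase.
                scope = tokens[i + n:i + n + WINDOW]
                negated.update(t for t in terms if t in scope)
    return negated


if __name__ == "__main__":
    vocabulary = {"pottery", "ditch", "coin"}  # hypothetical single-word terms
    print(negated_terms("No evidence of pottery was found in the ditch fill.", vocabulary))
    print(negated_terms("A coin was recovered from the pit.", vocabulary))
```

In this sketch "pottery" is flagged as negated in the first sentence, while "ditch" lies outside the assumed window and is left as a positive mention. A fuller adaptation would also need post-negation triggers, scope terminators and multi-word terms.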

Findings

The performance of the negation detection module is compared against a “Gold Standard” that consists of 300 manually annotated pages of archaeological excavation and evaluation reports. The evaluation results are encouraging, delivering overall 89 per cent precision, 80 per cent recall and 83 per cent F-measure scores. The paper addresses limitations and future improvements of the current work and highlights the need for ontological modelling to accommodate negative assertions.
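For readers unfamiliar with the F-measure, the sketch below shows the standard weighted harmonic mean of precision and recall. The abstract does not state which F variant was used or how the overall scores were aggregated, so the balanced setting (beta = 1) and the function name are assumptions.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (balanced F1 when beta is 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Rounded overall figures quoted in the abstract.
    print(round(f_measure(0.89, 0.80), 4))
```

Applied directly to the rounded overall figures, the balanced formula gives roughly 0.84, close to the reported 83 per cent. The small difference may reflect rounding or averaging over individual entity types, which is an assumption rather than something the abstract states.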

Originality/value

The NLP modules discussed contribute to the aims of the OPTIMA pipeline, delivering an innovative application of such methods to the semantic annotation of archaeological grey literature with respect to the CIDOC-CRM ontology.

Keywords

Acknowledgements

While the authors wish to acknowledge support from the ARIADNE project (FP7-INFRASTRUCTURES-2012-1-313193), the views expressed are those of the authors and do not necessarily reflect the views of the European Commission.

Citation

Vlachidis, A. and Tudhope, D. (2015), "Negation detection and word sense disambiguation in digital archaeology reports for the purposes of semantic annotation", Program: electronic library and information systems, Vol. 49 No. 2, pp. 118-134. https://doi.org/10.1108/PROG-10-2014-0076

Publisher

Emerald Group Publishing Limited

Copyright © 2015, Emerald Group Publishing Limited
