New possibilities for metadata creation in an institutional repository context

Alan Burk (University of New Brunswick, New Brunswick, Canada)
Muhammad Al‐Digeil (University of New Brunswick, New Brunswick, Canada)
Dominic Forest (Université de Montréal, Montréal, Canada)
Jennifer Whitney (University of New Brunswick Libraries, New Brunswick, Canada)

OCLC Systems & Services: International digital library perspectives

ISSN: 1065-075X

Article publication date: 6 November 2007

Abstract

Purpose

The purpose of this paper is to develop automated methods for creating metadata for documents in an institutional repository.

Design/methodology/approach

Two methods are examined for automatically building metadata in an institutional repository context. Text mining techniques are employed to discover relationships among documents with similar content, and possible values for missing or incomplete metadata elements are inferred from those relationships. Machine learning techniques are used to identify and extract specific metadata element values from document content.

Findings

Text mining techniques can be used to cluster documents with similar content. This allows values for metadata elements, such as keyword, to be projected from documents with established metadata to related documents. Machine learning techniques are found to be reasonably accurate for extracting values for metadata elements such as title, author, and abstract from documents. Results show sufficient promise to support the next phase of the project: the development of assistive tools for use by metadata specialists to create or edit document metadata.
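The idea of projecting keywords from catalogued documents to related ones can be illustrated with a minimal sketch. It is not the authors' method: the abstract describes clustering, whereas this example substitutes a simpler nearest-neighbour lookup over bag-of-words cosine similarity. The function names (`tokenize`, `cosine`, `project_keywords`), the `top_n` parameter, and the sample records are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def project_keywords(target_text, catalogued, top_n=1):
    """Suggest keywords for a document that lacks them.

    Assumption: `catalogued` is a list of dicts with "text" and
    "keywords" entries. Keywords are copied from the `top_n` most
    similar catalogued documents, preserving order and dropping
    duplicates.
    """
    target_vec = Counter(tokenize(target_text))
    ranked = sorted(
        catalogued,
        key=lambda doc: cosine(target_vec, Counter(tokenize(doc["text"]))),
        reverse=True,
    )
    suggestions = []
    for doc in ranked[:top_n]:
        for keyword in doc["keywords"]:
            if keyword not in suggestions:
                suggestions.append(keyword)
    return suggestions


# Hypothetical sample records, for illustration only.
catalogued = [
    {
        "text": "metadata harvesting for digital library repositories",
        "keywords": ["metadata", "digital libraries"],
    },
    {
        "text": "protein folding simulation on parallel hardware",
        "keywords": ["bioinformatics"],
    },
]

print(project_keywords("automatic metadata creation in a repository", catalogued))
```

In an assistive tool of the kind the abstract anticipates, suggestions like these would presumably be offered to a metadata specialist for review rather than applied automatically.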

Originality/value

This paper focuses on the use of automated metadata extraction techniques to assist metadata creation, lessening the time and effort required to add documents to institutional repositories.

Citation

Burk, A., Al‐Digeil, M., Forest, D. and Whitney, J. (2007), "New possibilities for metadata creation in an institutional repository context", OCLC Systems & Services: International digital library perspectives, Vol. 23 No. 4, pp. 403-410. https://doi.org/10.1108/10650750710831547

Publisher: Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited
