Knowledge Discovery in Bibliographic Databases

Rodney Brunt (Leeds Metropolitan University)

Library Review

ISSN: 0024-2535

Article publication date: 1 July 2001

Citation

Brunt, R. (2001), "Knowledge Discovery in Bibliographic Databases", Library Review, Vol. 50 No. 5, pp. 265-266. https://doi.org/10.1108/lr.2001.50.5.265.12

Publisher

Emerald Group Publishing Limited


This is a collection of papers on a relatively new area of research which has been attracting attention in library and information science. However, as the editors say, recent publications in the field of knowledge discovery in databases (KDD) have focussed primarily on viewpoints from outside the library and information science sphere. This issue of Library Trends attempts to redress the balance by assembling a series of essays which treat the topic clearly from the standpoint of this subject area.

KDD procedures attempt to retrieve information which is not immediately made visible by conventional information retrieval techniques, and in so doing draw on a range of techniques and methodologies from fields such as scientific research and business market research. According to Fayyad (quoted in the introduction) “KDD refers to the overall process of discovering useful knowledge from data and data mining refers to a particular step in the process”; data mining itself essentially focusses on patterns not previously recognised. Once‐hidden patterns are revealed by rigorous application of techniques, both human and computerised, and the connections revealed are then carefully screened to eliminate those found not to be useful or relevant.

The 13 papers are preceded by an introduction by the editors, both of whom are responsible for substantive contributions. The papers range from those which will be readily accessible to conventional librarians and information scientists, drawing as they do on experimentation based on well‐known concepts and applications, to those which are plainly harder work, since they are based in fields such as linguistics and statistical analysis.

The theoretical basis having been laid in Norton’s paper on KDD and Kwasnik’s on the role of classification in the approach, practical applications are described by Swanson and Smalheiser, on linkages between Medline records, and by Cory, on similar work using the Humanities Index. Small’s contribution presents a methodology for creating pathways through scientific literature, especially where discipline boundaries are crossed, in an effort to take advantage of diversity of treatment of topics rather than depending on the more usual homogeneous approach characteristic of normal online information retrieval.

Papers by Jian Qin on bibliographic coupling, Qin He on co‐word analysis and Ahonen on use of frequent word sequences examine the use of various semantic approaches; while in their paper on the use of abstracts and abstracting, Pinto and Lancaster revisit well‐established information retrieval approaches in this new context. Chowdhury explores the use of template mining in information extraction, which depends on natural language processing to extract information on the basis of recognisable patterns. The paper describes how templates are used in some Web search engines.

Papers on CINDI (Desai, Shinghal, Shayan and Zhou) and on knowledge discovery in spatial cartographic information retrieval (Yu) describe practical applications, in virtual libraries and geographic systems respectively, based on the metadata associated with the items in the respective collections.

To round off the issue, White usefully raises the question of tail‐ or dog‐wagging in the context of the relationship between librarians and electronic technology. The paper is a call to reverse the de‐skilling – letting librarians do what they are good at (or should be allowed to become good at) and leaving end users to pursue their specialisms, appropriately supported in their information needs. In too many instances we find the re‐invention or re‐prioritising of aspects of information retrieval to the detriment of the enquirer, whether as end user of an information retrieval system or as client of the professional intermediary. White reminds us that it will be knowledge workers (as opposed to computer specialists) who will be the most important in this century, and it is among those with the skills of the reference librarian that these vital workers will be found.

This is a valuable set of papers which bring a rarefied topic down to earth and make it available to the more conventional library community.
