PROBLEMS IN ANALYSIS AND TERMINOLOGY FOR INFORMATION RETRIEVAL

J. FARRADANE (Northampton College of Advanced Technology)
R.K. POULTON (Northampton College of Advanced Technology)
MRS S. DATTA (Northampton College of Advanced Technology)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 April 1965

Abstract

For the storage of information for subsequent retrieval of desired items, two stages of analysis are essential. The first is the determination of the subject content of a given article or paper; the second is the selection of certain words, groups of words, or classification headings by which the subject content is to be represented, either directly or by a suitable coding. Some workers still look forward to the day when it will be possible for the whole of a text to be read and ‘understood’ automatically by a machine; the ‘understanding’ process has been envisaged either as a process of selection of terms by the measure of word frequency, or word‐pair frequency (adjacent terms or terms not too far separated in one sentence) in the text, or by some process of automatic linguistic analysis. Such methods appear unsuitable for several reasons: language, as normally used, is a very difficult medium for exact expression (hence the value of mathematics) and few authors write well enough to avoid all ambiguities; a human reader accustomed to the subject can easily overcome any difficulties due to poor grammar, badly expressed arguments, excess brevity or prolixity in writing and even, sometimes, actual errors; a machine cannot do so. Furthermore, the content of a paper is rarely of uniform importance throughout, and it is not worth recording, for subsequent retrieval, details which are merely repetitions of matters described earlier and better elsewhere, and not essential to the main purpose of the paper; for example, in a paper on evaporator design, a description of a standard method of analysis, applied to the contents of the evaporator in determining the efficiency of the design, will not be worth indexing; in a search for analytical methods, retrieval of such a paper would hardly be considered pertinent. A human reader, though far from infallible, can usefully make such judgments.
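The frequency-based term selection that the abstract describes (and questions) can be illustrated with a minimal sketch. It is not the authors' method or any particular system of the period. The stop-word list, the function names, the `max_gap` parameter and the sample text are all assumptions made for illustration, and "not too far separated" is interpreted here as a fixed number of positions counted after stop words are removed.

```python
import re
from collections import Counter

# Hypothetical stop list, for illustration only.
STOP_WORDS = {"the", "of", "a", "an", "in", "is", "to", "and", "for", "by", "on", "was"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def term_frequencies(text):
    """Count how often each non-stop word occurs in the whole text."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        counts.update(tokenize(sentence))
    return counts


def pair_frequencies(text, max_gap=2):
    """Count unordered word pairs lying within max_gap positions of each other
    in the same sentence (positions are counted after stop-word removal,
    which is an assumption of this sketch)."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        words = tokenize(sentence)
        for i, first in enumerate(words):
            for second in words[i + 1 : i + 1 + max_gap]:
                if first != second:
                    counts[tuple(sorted((first, second)))] += 1
    return counts


# Invented sample text, loosely echoing the evaporator example in the abstract.
sample = (
    "The evaporator design improves heat transfer. "
    "The efficiency of the evaporator design was measured by a standard analysis."
)

top_terms = term_frequencies(sample).most_common(2)
top_pair = pair_frequencies(sample).most_common(1)
```

On this sample the most frequent pair is `("design", "evaporator")`, but the pair `("analysis", "standard")` is counted by exactly the same rule. Counting alone carries no judgment about whether a passage is pertinent enough to index, which is the kind of decision the abstract assigns to a human reader.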

Citation

FARRADANE, J., POULTON, R.K. and DATTA, S. (1965), "PROBLEMS IN ANALYSIS AND TERMINOLOGY FOR INFORMATION RETRIEVAL", Journal of Documentation, Vol. 21 No. 4, pp. 287-290. https://doi.org/10.1108/eb026380

Publisher

MCB UP Ltd

Copyright © 1965, MCB UP Limited
