Library Hi Tech

ISSN: 0737-8831

Online from: 1983

Subject Area: Library and Information Studies

Heuristics for identification of bibliographic elements from verso of title pages


Document Information:
Title: Heuristics for identification of bibliographic elements from verso of title pages
Author(s): A.R.D. Prasad (Associate Professor in the Documentation Research and Training Centre, Indian Statistical Institute, Bangalore, Karnataka, India), Durga Sankar Rath (Lecturer in the Department of Library and Information Science, Ravindra Bharati University, Kolkata, India)
Citation: A.R.D. Prasad, Durga Sankar Rath (2004), "Heuristics for identification of bibliographic elements from verso of title pages", Library Hi Tech, Vol. 22 Iss: 4, pp. 397-403
Keywords: Bibliographic systems, Cataloguing, Classification schemes, Data handling, Information operations
Article type: Research paper
DOI: 10.1108/07378830410570502
Publisher: Emerald Group Publishing Limited
Abstract: This paper presents a methodology to capture bibliographic data from the verso of the title pages of documents. A survey was undertaken to identify the syntactic and semantic features of bibliographic elements on the verso of title pages. These features include the font size, line numbers and appearance of certain strings of characters. Emphasis is given to the study of “cataloguing-in-publication” data. The results of the survey are used to develop heuristics that can help in developing a program to automatically identify the various bibliographic data elements. The versos of the title pages are scanned and stored as HTML pages using optical character recognition (OCR) software. The heuristics are then applied to the HTML pages. A few samples of input and the generated output are presented. Finally, the problems related to OCR and the heuristics are enumerated.
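
The abstract does not list the heuristics themselves, so the following is only a minimal sketch of the general idea of string-cue matching on OCR output, not the authors' method. The patterns, the function name `identify_elements`, and the sample lines (including the placeholder ISBN and publisher) are all assumptions made for illustration. Font-size and line-position rules, which the abstract also mentions, are not shown because their details are not given here.

```python
import re

# Hypothetical cue patterns; the paper's actual heuristics are not given in the abstract.
ISBN_RE = re.compile(r"ISBN(?:-1[03])?[:\s]*([0-9][0-9Xx\- ]{8,15}[0-9Xx])")
COPYRIGHT_RE = re.compile(r"(?:©|\(c\)|copyright)\s*(?:©\s*)?(\d{4})", re.IGNORECASE)
CIP_RE = re.compile(r"catalogu?ing[\s-]+in[\s-]+publication", re.IGNORECASE)


def identify_elements(lines):
    """Scan OCR'd text lines and return the first match for each string cue.

    Each value is a (line_number, matched_text) pair, with line numbers
    starting at 1.
    """
    found = {}
    for number, text in enumerate(lines, start=1):
        match = ISBN_RE.search(text)
        if match and "isbn" not in found:
            found["isbn"] = (number, match.group(1).strip())
        match = COPYRIGHT_RE.search(text)
        if match and "copyright_year" not in found:
            found["copyright_year"] = (number, match.group(1))
        if CIP_RE.search(text) and "cip_line" not in found:
            found["cip_line"] = (number, text.strip())
    return found


# Made-up verso lines for illustration only.
sample_lines = [
    "Published by Example Press",
    "Copyright © 1998 Example Press",
    "British Library Cataloguing in Publication Data",
    "ISBN 0-00-000000-0",
]
result = identify_elements(sample_lines)
print(result)
```

Recording the line number alongside each match keeps the positional information available, so that rules based on where an element appears on the page could be layered on top of the string cues.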


