Archiving Websites. A Practical Guide for Information Professionals

Susan Childs (Northumbria University, Newcastle‐upon‐Tyne, UK)

Records Management Journal

ISSN: 0956-5698

Article publication date: 27 February 2007

Citation

Childs, S. (2007), "Archiving Websites. A Practical Guide for Information Professionals", Records Management Journal, Vol. 17 No. 1, pp. 65-66. https://doi.org/10.1108/09565690710730714

Publisher

Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited


As Adrian Brown notes, information on the Web is transient: “the average lifespan of a web page is between 75 and 100 days” (p. 3). There is clearly a need to preserve web content, not only so that information is not lost, but also to keep a record of the development of web technology and web page design. The publication of this book is therefore very timely. It is aimed at information professionals who need an introduction to, and overview of, web archiving at all scales, from the large to the small. It is deliberately not a technical guide, though software tools are listed in an appendix. The author is well qualified to write on this topic, being the Head of Digital Preservation at the National Archives.

The contents of the book comprise:

  • Introduction. The need for web archiving and for this book in particular.

  • Development of web archiving. A brief history of the topic.

  • Selection. Developing an appropriate selection policy.

  • Collection methods. The pros and cons of various methods of collecting websites for archiving, which are dictated by the web technology used to produce the sites.

  • Quality assurance and cataloguing. The need for a quality assurance process (as much data is collected automatically and needs to be checked for successful implementation) and cataloguing (including automatic metadata capture).

  • Preservation. The principles and practicalities of long term digital preservation.

  • Delivery to users. Different mechanisms to enable users to access the archived content.

  • Legal issues. Including intellectual property rights and privacy; as the author notes, “the web … raises a plethora of complex legal issues” (p. 146).

  • Managing a web archiving programme. The practicalities of establishing, resourcing and maintaining a web archiving programme.

  • Future trends. As the author notes, “attempting to forecast future trends … in such a rapidly changing and unpredictable field … is a notoriously inexact science” (p. 184); the issues he covers in this section include international collaboration, international standards, Web 2.0 and the semantic web.

Each chapter is provided with a list of references and web resources (some of which will unfortunately soon become out of date, as many sites do not use stable URLs as recommended (p. 66)). Real‐life case studies are presented at certain points. This is a good idea, but it would have been nice to see more of them, and for them to be more clearly highlighted in the text. Another nice touch is the model job description for a web archivist given in Appendix 5, which is useful for new entrants to the profession; “as one door closes another one opens”.

This book covers every aspect of setting up and implementing a web archiving programme in a well‐organised and clearly explained fashion. As well as information professionals (both practitioners and students), it will be of use to a wider audience of managers, website owners and webmasters. It should certainly be in every university library, as the student readership for such a book would extend across many disciplines in computing, business, information studies/sciences and records management.
