Web Archiving

Tony Rodrigues (University of South Africa)

Online Information Review

ISSN: 1468-4527

Article publication date: 27 November 2007



Citation

Rodrigues, T. (2007), "Web Archiving", Online Information Review, Vol. 31 No. 6, pp. 910-911. https://doi.org/10.1108/14684520710841883

Publisher: Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited


The book addresses the problem of preserving the vast amount of information available on the web. Each chapter is contributed by different computer scientists and librarians who are involved in web preservation or archiving.

The first chapter serves as an introduction to what web archiving is, and why the preservation of web information is important as a record of the culture and history of today's society. It explains the need to guarantee the preservation of, and access to, current web information for future generations. The chapter also reviews the problems that web preservation raises and the methods that have been developed to overcome them. In the second chapter the authors provide an overview of the methodological approaches researchers use to study the web, referred to as web studies. Methods include content analysis, surveys, and visual and network analysis. Chapter 3 examines the key phase in web archiving: the selection of web information for long‐term preservation. The author emphasises the importance of archiving institutions developing a selection policy, just as they would for printed materials. In Chapter 4 the contributing author looks at different ways of copying and preserving entire web sites, so that the copies mirror the original web site at different times, as a means of web archiving. The following chapter deals with archiving the so‐called hidden web, i.e. the web that is not accessible by search engines, while Chapter 6 examines the challenges relating to access and finding aids for archived web materials.

Chapter 7 discusses the usage of web archives. The author explains that usage‐related issues should be viewed from organisational, functional and technological perspectives. Chapter 8 discusses the actual long‐term preservation of web content. Digital preservation strategies, such as emulation and migration, are noted, as is the key role of metadata.

Chapter 9 traces the establishment of the Internet Archive, which strives to be a holistic web archiving project. The author describes the mission and goals of the archive and its future. The author also explains the importance of the projects in which the archive is involved, such as the creation of a mirror site at the Library of Alexandria. In contrast, the final chapter examines an example of a small‐scale academic web archiving project, the Digital Archive for Chinese Studies, as another possible approach to web archiving.

The book includes a useful index, which helps bring together related issues presented by the various contributing authors. There are also a number of relevant figures, and each chapter is followed by a comprehensive list of references.

Certain chapters present technical information that may be difficult for readers who are new to the field of web archiving. However, as a valuable source of information offered by a range of experts in the field, this book is recommended reading for archivists, records managers, librarians, computer scientists and information managers, that is, all information practitioners interested in the what, why and how of preserving information available on the web.
