Web Content Archiving Tools/Resources

Web archiving and social media archiving overlap considerably for social media sites that are on the web. See also: Social Media Archiving Resources.

HANDOUT

Keeping Personal Websites, Blogs and Social Media (Library of Congress)

http://www.digitalpreservation.gov/personalarchiving/documents/PA_websites.pdf

This 1-page brochure clarifies some priorities and objectives of archiving websites.

VIDEOS

Web Archiving (3:48) (Library of Congress)

http://www.digitalpreservation.gov/multimedia/videos/webarch09.html

Web content changes all the time. If we don’t save that content before it disappears, a major part of our cultural history will be lost. The Library of Congress is working to provide permanent access to web content of historical importance. It selects websites for collection, requests permissions from the website owners, addresses the technology of collecting websites, and preserves the websites and makes them available. This video examines those four challenges.

Preserving Personal Web Content (8:06) (Library of Congress)

http://www.digitalpreservation.gov/multimedia/videos/personalarchiving-webcontent.html

On May 10, 2010, the Library of Congress held Personal Archiving Day in conjunction with the American Library Association’s annual Preservation Week. […] In this video, Abigail Grotke, web archiving team lead, and Gina Jones, digital media project coordinator, both from the Office of Strategic Initiatives’ Web Archiving team at the Library of Congress, offer practical advice on preserving web content.

TOOLS

Creating a Static Copy of a Website

https://swsblog.stanford.edu/blog/creating-static-copy-website

This Stanford University blog post describes several tools available for archiving websites and offers advice on using them.

Scrapbook Firefox Extension

https://addons.mozilla.org/en-US/firefox/addon/scrapbook/

ScrapBook is a Firefox extension that helps you save web pages and easily manage collections. Key features are lightness, speed, accuracy and multi-language support.

WordPress Backups

https://codex.wordpress.org/WordPress_Backups

Instructs users on how to create and save backups of WordPress web pages. Covers multiple methods, including plugins and software.

REPORTS

Truman, G. (2016). Web archiving environmental scan. (Harvard Library Report). Retrieved from https://dash.harvard.edu/handle/1/25658314.

This report, published in January 2016, documents web archiving programs at 23 institutions around the world, focusing on libraries, museums, and archives.

Pennock, M. (2013). Web-Archiving (DPC Technology Watch Report 13-01). Salisbury, UK: Charles Beagrie Ltd. Retrieved from http://dx.doi.org/10.7207/twr13-01

This report provides an overview of the issues surrounding lost content related to digital cultural memory and the technology involved.

SCHOLARLY ARTICLES

Niu, J. (2012a). An overview of web archiving. D-Lib Magazine, 18(3/4). Retrieved from http://doi.org/10.1045/march2012-niu1

This overview surveys the methods that a variety of universities and international government libraries and archives use to select, acquire, describe, and provide access to web resources for their archives.

Niu, J. (2012b). Functionalities of web archives. D-Lib Magazine, 18(3/4). Retrieved from http://doi.org/10.1045/march2012-niu2

The functionalities that are important to the users of web archives range from basic searching and browsing to advanced personalized and customized services, data mining, and website reconstruction. The author examined ten of the most established English language web archives to determine which functionalities each of the archives supported, and how they compared.