Preservation and perpetuity

Dr Hine writes:

The digitisation phase of the H.W. Cassirer Collection concludes in a few weeks. This terminus reflects funding limits rather than the exhaustion of possible tasks. As Biblical Studies and Digital Humanities research associate for the project, I am also de facto technical lead. Much of my time is now occupied with the niceties of making the data available in a format friendly to a general audience, transforming structured machine- (and human-) readable data into a more visual format.

The initial project paperwork, an agreement with the estate of the late Ronald Weitzman, commits the University of Sheffield to host Cassirer’s works on “a website in perpetuity”. This is a fragile kind of promise. It is one thing to aim to preserve data long term, and quite another to maintain it in an online portal in the same form ad infinitum.

Even inexpert internet users are likely aware that the look, feel, and usability of websites change over time. The press report data security breaches and hacking; we are prompted to invest in and update anti-virus software, and to avoid clicking on links in suspect emails lest a digital worm emerge. Under a website’s bonnet, technology is changing and has to change on a regular basis. The alternative may prove digitally disastrous: not only complete data loss, but harm to those visiting a site. To maintain a website in perpetuity is a non-trivial task.

The pledge of perpetuity has thus plagued my thoughts from day one. Given that the digital commitment had already been made, how could I best honour the promise, in spirit if the letter itself could not be fulfilled?

On planned endings

It so happens that as the project’s end approaches, a new window has opened, providing intellectual breathing space in relation to this very issue: an online symposium on the topic of Project Resiliency in the Digital Humanities. Spread across four dates so as to accommodate as many time zones as possible, this symposium is a component of The Endings Project, a five-year project funded by Canada’s Social Sciences and Humanities Research Council. The core team are based at the University of Victoria, British Columbia.

Last Thursday, I made a late-night appointment with the first symposium session. Position papers are pre-recorded and shared with panellists, so that in each session speakers give a brief summary of their core points before opening up to panel discussion. As an audience member, I could interact by posting in the Q&A tool.

The first speaker, Sara Diamond, shared two case studies from her own career in order to illustrate the loss of digitised research data and the need to be proactive in maintaining and updating formats. She happened to mention how one host institution had pledged to maintain data in perpetuity. Spoiler: they had lost the server. (A significant dataset was inaccessible, its whereabouts unknown.)

Diamond’s remark prompted me to ask the panellists about perpetuity as a principle. The reaction varied. One respondent judged that five years was a good measure for the life of any web-based digital project. Even CERN, who offer free storage for scientists on the basis that anything is a drop in the ocean compared to the Large Hadron Collider’s output, currently promise only twenty years’ protection. That promise is field-leading, yet it extends only to data preservation, i.e., storage, not functionality. (CERN’s repository, Zenodo, might serve to archive a website’s contents, but not host it.)

Digital data may seem futuristic, but without care its lifetime is limited to less than one human generation. Of course, as climate change promises radical upheaval, that may be more than enough to be going on with.

A second helpful observation, and one that chimed with Cassirer plans, was that data storage and the platform on which the data is accessed require different solutions. The company I’m working with on hosting and server maintenance find their horizon maxes out around the five-year mark because of known limits on software support. The transition to different software may prove trivial, but it may not. There is a limit to how much budget could be set aside, though the hope is to run the legacy site for a full decade, and the domain name (cassirer.org) has been purchased on that basis. The digital files will also be deposited long term in at least two repositories. These deposits preserve the possibility of an afterlife. In that respect, I suspect Cassirer and Weitzman would have approved.

--Dr Iona Hine, April 2021