This page describes the very basics of LOCKSS for new members of the preservation network.
LOCKSS -- Lots Of Copies Keeps Stuff Safe
LOCKSS is a preservation network built on the premise that the most reliable approach to long-term preservation is to create multiple replicas of data and distribute those replicas across geographic locations and hardware.
While one could make multiple copies of data on various media and mail those media to random corners of the globe, the media themselves are not interconnected. When data is replicated across disconnected media, discrepancies between the copies invariably emerge. That is, without checking and monitoring for changes, one will find that two presumably identical sets of data are in fact not identical. At that point, with nothing else to compare against, it is a challenge to determine which copy is correct.
LOCKSS creates a preservation network in which identical replicas of data are distributed across multiple servers that are independent but connected to one another over a network. Each server holds a full copy of the preserved data. To monitor for discrepancies, the preservation servers participate in polls: each server creates a hash of the preserved content and sends that hash as its vote. LOCKSS has elaborate logic to determine when action is taken, but it boils down to this:
- The votes are tallied to see if a quorum exists
- If a quorum exists and a preservation node finds itself in the minority, the node corrects its own copy of the content to align with the majority
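
The poll described above can be illustrated with a small sketch. This is not the actual LOCKSS protocol, which is considerably more elaborate. It assumes SHA-256 as the hash, treats the quorum as a minimum number of votes, and requires a strict majority to pick a winner. The function names (`content_hash`, `tally`, `nodes_needing_repair`) and the node names are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional


def content_hash(data: bytes) -> str:
    """Hash a node's copy of the content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def tally(votes: Dict[str, str], quorum: int) -> Optional[str]:
    """Return the majority hash, or None if no decision can be made.

    votes maps a node name to the hash that node voted with.
    quorum is the assumed minimum number of votes for a valid poll.
    """
    if len(votes) < quorum:
        return None  # too few voters: no quorum, so no action
    winner, count = Counter(votes.values()).most_common(1)[0]
    if count * 2 <= len(votes):
        return None  # no strict majority agrees on one hash
    return winner


def nodes_needing_repair(votes: Dict[str, str], quorum: int) -> List[str]:
    """List the nodes whose vote disagrees with the majority hash."""
    winner = tally(votes, quorum)
    if winner is None:
        return []
    return sorted(node for node, h in votes.items() if h != winner)


# Three nodes vote; node-c holds a damaged copy.
good = b"preserved content"
votes = {
    "node-a": content_hash(good),
    "node-b": content_hash(good),
    "node-c": content_hash(b"preserved c0ntent"),
}
print(nodes_needing_repair(votes, quorum=3))  # prints ['node-c']
```

In this sketch the minority node would then fetch a fresh copy from a node in the majority. When there are too few votes or no clear majority, nothing is repaired.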