Archival Unit

From Adpnwiki

An Archival Unit (AU) is the basic unit of preservation on the ADPNet LOCKSS network. Each AU has a Manifest Page and a collection of digital assets that the manifest page links to.

From "Getting Started with LOCKSS" (June 2016)

Each Archival Unit (AU) must be Web-accessible so the LOCKSS daemon can get at it. That means that it needs to be put on a Web-accessible server computer. If you have a server, but it isn't Web-accessible or access to it is blocked (at the firewall, for example), the LOCKSS crawl won't work. You can use a firewall to protect your collection, but make sure the LOCKSS daemon has access. Check with your IT support person or system administrator to confirm that this is the case.
[...]
The next major part of getting collections ingested is preparing each collection’s manifest page. The manifest page is a basic HTML document that contains at least two things: the base URL of the AU for the collection (or for a fraction of the collection) and, at the foot of the document, the permission statement that gives the LOCKSS daemon the right to harvest the collection.
Here is an example of a fairly simple manifest page for a fairly simple collection (Auburn University’s Alabama Postcards images collection). Feel free to use it as a template.
[Image: ExampleAlabamaPostcardsCollectionManifest.png]
The base URL isn't visible if you've opened the manifest in a browser window, because it's located inside the href attribute of the HTML <a> tag. To see this tag, open the HTML file in Dreamweaver or Notepad++. If you're opening it in a browser window, go to View => Source to see the tags.
I usually put all the URLs for the sub-folders in my AU, but the LOCKSS folks tell me it isn't really necessary, as the daemon will crawl all the folders under the base URL unless you tell it not to. So you could leave out everything except the base URL. NOTE: These URLs are not the CONTENTdm collection URLs; they're the URLs for the collections (or AUs) on your Web server.
The manifest page should be kept in the same folder as the AU so that the LOCKSS daemon can find it easily. If you break a collection up into digestible chunks (50 GB or so), put each AU chunk in its own folder with its own manifest page. I also like to do a metadata export from the relevant CONTENTdm collection as a tab-delimited text file and put a copy of that in each folder as well, in case the CONTENTdm collection needs to be reconstituted after a disaster.
So, if you have one AU for a collection, you need one manifest with the appropriate base URL. If you have two AUs, you need two manifests, one in each AU folder, each with the appropriate base URL for that particular AU. Here is a screenshot of the folder directory for the Alabama Postcards AU, so you can see what is in the folder. The manifest has a little box around it, so you can't miss it.
[Image: ExampleAlabamaPostcardsCollectionDirectoryListing.png]
—From "Getting Started with LOCKSS" (June 2016) by Midge Coates
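
The manifest layout described in the guide above (a link to the base URL, optional links to sub-folder URLs, and the permission statement at the foot) can be sketched in Python. This is an illustration only and is not the template shown in the screenshot: the function name build_manifest, the example.edu URLs, and the page layout are assumptions. The permission wording is the statement commonly seen on LOCKSS manifest pages, so confirm the exact text with your network's administrators before using it.

```python
from html import escape

# Permission wording commonly seen on LOCKSS manifest pages (assumption:
# confirm the exact statement your network requires before relying on it).
PERMISSION_STATEMENT = (
    "LOCKSS system has permission to collect, preserve, and serve "
    "this Archival Unit"
)


def build_manifest(title, base_url, subfolder_urls=()):
    """Return a minimal HTML manifest page for one AU as a string.

    The base URL is listed first; sub-folder URLs are optional, since the
    guide notes the daemon follows folders under the base URL anyway.
    """
    links = [base_url, *subfolder_urls]
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in links
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"  <title>{escape(title)} (LOCKSS manifest)</title>\n"
        "</head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        # The permission statement goes at the foot of the document.
        f"  <p>{PERMISSION_STATEMENT}</p>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical URLs for an AU on your own Web server
    # (not the CONTENTdm collection URLs).
    page = build_manifest(
        "Alabama Postcards",
        "http://example.edu/lockss/postcards/",
        ["http://example.edu/lockss/postcards/images/"],
    )
    print(page)
```

If a collection is split into several AUs, the function would be called once per AU with that AU's own base URL, and each result saved in that AU's folder.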