About the PACSCL Diaries Project
Technical Specifications: technical information about accessing digital images from the OPenn website, and about the conventions and standards used in creating the data
Metadata Data Entry Guide (.pdf)
Metadata Spreadsheet Template (.xlsx)
The Philadelphia Consortium of Special Collections Libraries (PACSCL) is pooling its resources to create a shared open-access website hosting a selection of digitized diaries from its member institutions. Images and metadata describing the diaries are available for download and reuse. (This pilot site also includes a simple interface for viewing the diaries.) The objectives of this project are:
- To create a website of historic diaries, freely available at full resolution with appropriate metadata, using in-kind contributions from PACSCL members and modest financial support from PACSCL.
- To develop an infrastructure for the hosting of digital objects that is secure, sustainable, scalable, and affordable.
- To demonstrate PACSCL’s value to both the scholarly and the library community via a branded interface that allows easy discovery, access, and downloading.
- To demonstrate the value of open data in collaborative endeavors.
- To engage all members of the PACSCL community in the shared goal of exposing their unique collections.
The site offers any user the ability to download or view facsimiles of the diaries along with metadata. The University of Pennsylvania Libraries is hosting the project, with staff at the Kislak Center for Special Collections, Rare Books and Manuscripts managing the data and image delivery process and PACSCL itself providing this interface. Participating institutions deliver images of the diaries and associated descriptive metadata to the Kislak Center, conforming to the guidelines in the project documentation. Particular attention is called to the licensing: all data in the project is released under nothing more restrictive than a Creative Commons BY license.
PACSCL is asking those members with robust digital infrastructure to take the lead by digitizing, storing, and preserving not only their own diaries but also a small number of diaries from an institution (or institutions) with more limited technological capabilities. PACSCL is working to pair these institutions. PACSCL will also supply contacts within its experienced community to advise institutions on the preparation of the images and associated metadata.
For this pilot project, the site currently includes approximately 35 diaries, with more in process. Most are materials that had previously been digitized, with a small number digitized by the Schoenberg Center for Electronic Text and Image to test the model of collaboration between larger and smaller institutions. The high-resolution images and metadata are fully downloadable and available at OPenn: Primary Digital Resources Available to Everyone.
This is a PACSCL community project. It is funded by in-kind contributions from PACSCL members, and the site will be financially sustained by PACSCL. The underlying site has been designed to be extremely cost-effective, and it is hoped that production costs will be kept to a minimum through the simplicity of the data collection and the distributed nature of the endeavor.
Why such a site?
As the PACSCL finding aids database has demonstrated, a single-site corpus of material from all institutions is less expensive, takes less maintenance, is more reliable, and delivers a better product than if an interface is built off just one institution’s holdings. In trying to create a functional, low-maintenance, consistent interface for a PACSCL presentation of our joint assets, the project team has chosen to have that interface draw data from one single portal, where metadata and images are stored in consistent and standard ways, rather than aggregate varied data from (potentially) as many repositories as there are members of PACSCL.
Why an open site?
For this project, it makes sense for the data to be openly available for use. Specifically, neither image data nor metadata can be hosted if the rights attached to them are more restrictive than those of a Creative Commons Attribution license (CC-BY). This means that images released under a Creative Commons Zero (CC0) dedication, or which are in the public domain, can also be hosted. In this way, all assets can be termed “Free Cultural Assets”. The reasons for this are manifold.
- It promotes, and reduces the cost of, collaboration between institutions. Institution X is less likely to image, and even less likely to archive, institution Y’s assets, if they are encumbered by rights effectively limiting what they can do with those assets.
- It enables flexibility in the presentation and archiving of the assets, and this might be crucial as standards and funding models change.
- Because this licensing allows anybody to do what they want with the materials, it allows institutions themselves to present their own materials independently of PACSCL, in whatever way they would like.
- The site is open precisely so that all the data on it can be effectively and easily downloaded. This is incompatible with licenses more restrictive than the ones stipulated.
- It is also important to the sustainability of the project that the archive is inexpensive to put together, and inexpensive to maintain. This is difficult to do if the assets are not all subject to the same standards, in licensing, as well as in metadata and format.
In short, this project will not work except as an open project. By taking part in the project, institutions will be giving up their ability to require payment for access to these digital images and metadata, and will not have control over their use by others. However, this decision will have a broad range of positive impacts that would be difficult or impossible to achieve otherwise. For example, through the open diaries project institutions will be able to:
- Get their assets digitized in the first place, conserved in repositories, and hosted publicly.
- Contribute to a digitization project that is more than the sum of the parts of any one institution.
- Make their collections more available, in more useful ways, to more people than is possible under any other model.
- Demonstrate that PACSCL can collaborate effectively on joint digitization projects, even in the absence of outside funding.
Staff at participating libraries are responsible for providing images (or diaries for imaging), together with metadata for their diaries. Laura Blanchard, PACSCL, provides general administrative support and has designed and built this WordPress installation, with technical advice and assistance from Dennis Mullen (Kislak Center). Special thanks are due to the Kislak Center’s project team (Will Noel, Jessie Dummer, Doug Emery, Mitch Fraas, Holly Mengel, Dennis Mullen, and Dot Porter), together with the staff at the Schoenberg Center for Electronic Text and Image at the University of Pennsylvania Libraries, for imaging diaries, uploading images and metadata to the OPenn repository, and creating page-turners for each diary.