Many academic libraries have implemented institutional repositories (IRs), which offer some potential for publishing and archiving data but can fall short of usability expectations and requirements. As general-purpose repositories, IRs may lack the metadata and service capabilities to dynamically represent data and to effectively capitalize on complex, domain-oriented metadata schemas such as the Ecological Metadata Language (EML) or the ISO 19115-2014 Geographic information metadata standards. For libraries interested in providing data publishing services, one alternative to the domain-agnostic nature of IR content and metadata models is to maintain the IR for traditional document types and, in addition, stand up one of a number of more data-ready applications such as CKAN or the Dataverse Network. This strategy, however, presents some drawbacks: in addition to creating what may be a confusing proliferation of library-hosted repositories, the human and financial resources needed to maintain parallel services are often limited.
In light of these considerations, there are arguments in favor of using IRs for both the publication of active, production data and the long-term preservation of static or archival data. Currently, it is the rule rather than the exception for a given research domain to lack established repositories. Additionally, the skills required to support IR architectures and policies are readily available within academic library settings. As a low-barrier option for meeting the growing number of federal and other funder sharing requirements, IRs further satisfy what may for many researchers be an otherwise burdensome or vague mandate. Finally, even in cases where production or current data has been published via a domain repository, storage constraints and the lack of preservation features within many data sharing platforms highlight the utility of IRs in serving an archival mirroring function that is complementary to the services provided by domain repositories.
Academic library IRs are therefore legitimate platforms for data publication. Moreover, because they are often built using extensible, open-source applications such as DSpace, they may be quickly adapted to provide data-friendly features without straining available human or technical resources. In this chapter we describe the requirements and proposed benefits of an IR service focused on archival mirroring of collections previously published within domain repositories but scheduled or otherwise designated for migration to an alternative preservation platform. Beginning with outreach strategies and the case for such a service, we further document methodologies for characterizing and modeling domain repository features within a general-purpose IR and for enhancing the discovery of and access to archived data through OAI-PMH services and the new DSpace API.
Association of College and Research Libraries
Curating Research Data
data curation, data archiving, data preservation
Wheeler, Jonathan. "Extending Data Curation Service Models for Academic Library and Institutional Repositories." In Curating Research Data, ed. Lisa Johnston, 1:171–92. Chicago, Illinois: Association of College and Research Libraries, 2017.
This work is licensed under a Creative Commons Attribution 4.0 International License.