Digital Curation Blog

The Digital Curation Centre now has a blog. It is well worth following for discussion of policy, service, and technical issues surrounding data curation and repositories generally. Here is a snippet:

Digital Curation is maintaining and adding value to a trusted body of digital information over the life cycle of scholarly and scientific materials, for current and future use. It is our belief in the DCC that the curation of digital data requires this whole of life approach. Critical decisions on the curation of data are taken before the data are even created, often at the time the associated project is conceived, or funding is sought. This is not least because curation requires resources that must be allowed for within the work plan. It is increasingly clear that for any project involving data of value, you should provide a data management plan within the project proposal (NSF, 2007).

Digital curation includes good management of data for current purposes, and also in many cases the preservation of those data for the long term. Long term preservation is not necessarily an essential part of curation in all cases, although it is usually a desirable aspect (subject to appraisal and selection decisions). So we can think of curation as having two important components, which we can label “data publication”, for the process of making current data available for use by other contemporaries, and “data preservation”, for the process of making those data available for future users.

Data publication recognises that more and more “reference” works are migrating into the digital domain as curated databases, and that increasingly these are data (or sometimes combinations of data and text) rather than pure text. There are interesting questions on when and how a dataset is “published” such as those raised by Bryan Lawrence in the linked post, but I’ll skip those for now! Such reference datasets can change quite frequently, including the correction or deletion of information as well as the addition of new. The requirements for integrity and stability versus the need for change to promote accuracy bring special problems. During development of the resource, if you are interested in the long term, you will have to ensure that contextual and other information needed for preservation are gathered. This may have to happen even before you make firm decisions about preservation!

As your dataset stabilises, and in particular as it comes out of current use, it may be eligible for long term preservation. This is not an automatic choice, as resources are currently spread too thinly to preserve everything. It will be important in any data management plan to identify candidate archives for preservation that serve the appropriate “designated community” (perhaps your scientific discipline). [Digital Curation Blog]

The author here is Chris Rusbridge, Director of the Digital Curation Centre.

lorcan dempsey dot net