Articles on structured data: matching, mining and mixing

Lorcan 2 min read

The current issue of Library Resources and Technical Services (not on the web) has a couple of interesting articles which touch on the complications of processing inconsistent data.
Creating organization name authority within an ERM system
Kristen Blake and Jackquie Samples
LRTS 53(2) April 2009 p 94-107
This article looks at issues of organization name consistency within ERM systems. There is some general discussion of structured data within ERM systems, followed by a specific focus on organization names. Some OCLC initiatives are discussed as part of the environmental analysis, and several people managing ERM functions in libraries are interviewed. Work on this issue at NCSU is then described. The authors conclude by suggesting that such work provides local benefits, and they point to the advantages of greater community discussion and agreement.
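To make the organization-name problem concrete, here is a minimal sketch of how variant forms of a name might be grouped under a crude match key. It is not the approach described in the article or used at NCSU; the `SUFFIXES` list, the function names and the sample names are all hypothetical, and real authority work would need far more care than this.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical list of trailing corporate designators to ignore when matching.
SUFFIXES = {"inc", "ltd", "llc", "co", "corp", "corporation", "company", "limited"}


def match_key(name: str) -> str:
    """Reduce an organization name to a crude key for grouping variants."""
    # Strip accents, lowercase, and treat "&" as "and".
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower().replace("&", " and ")
    # Replace punctuation with spaces and split into words.
    words = re.sub(r"[^a-z0-9 ]+", " ", text).split()
    # Drop trailing designators such as "Inc" or "Ltd".
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def group_variants(names):
    """Group name strings that share the same match key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return dict(groups)


# Hypothetical variant forms, as they might appear in different ERM records.
variants = ["Example Press, Inc.", "EXAMPLE PRESS", "Smith & Jones Ltd", "Smith and Jones"]
for key, forms in group_variants(variants).items():
    print(key, forms)
```

A key like this trades precision for recall: it will merge some names that should stay distinct, which is exactly the kind of trade-off between flexible data creation and downstream processing that makes an agreed authority form attractive.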
Automated metadata harvesting: low-barrier MARC record generation from OAI-PMH repository stores using MarcEdit.
Terry Reese
LRTS 53(2) April 2009 p 121-134
This article outlines an environment in which a library is interested in metadata for digital resources from multiple sources, often available via OAI-PMH as organizations expose data about their digital collections. A particular requirement is explored: to convert such multiple streams into MARC for internal library use. This involves crosswalking between MARC and metadata from different creation regimes. A particular issue here is the absence of widely deployed content standards, which means that inconsistent approaches to data creation have to be managed. OCLC services are also considered as part of the environmental analysis. The article then considers how MarcEdit, a tool created by the author, can be used to address the requirements described.
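As a rough illustration of what such a crosswalk involves, the sketch below maps the unqualified Dublin Core elements of a single `oai_dc` record onto MARC tag and subfield pairs. It is not how MarcEdit works; the `CROSSWALK` table is a simplified, illustrative mapping, and the sample record and repository URL are hypothetical. The whitespace cleanup and the skipping of empty elements hint at the inconsistent input a real harvest has to manage.

```python
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree puts on Dublin Core element tags.
DC = "{http://purl.org/dc/elements/1.1/}"

# Illustrative mapping from Dublin Core element to (MARC tag, subfield code).
# A real crosswalk involves local decisions that this table ignores.
CROSSWALK = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
}


def crosswalk(dc_xml: str):
    """Return (tag, subfield, value) tuples for one oai_dc record."""
    root = ET.fromstring(dc_xml)
    fields = []
    for element in root.iter():
        if not element.tag.startswith(DC):
            continue  # skip the oai_dc wrapper and anything non-DC
        name = element.tag[len(DC):]
        # Collapse stray whitespace; skip elements with no usable text.
        value = " ".join((element.text or "").split())
        if not value:
            continue
        if name == "identifier" and value.startswith(("http://", "https://")):
            fields.append(("856", "u", value))
        elif name in CROSSWALK:
            tag, code = CROSSWALK[name]
            fields.append((tag, code, value))
    return fields


# Hypothetical record, shaped like the metadata payload of an OAI-PMH response.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>  A hypothetical   digitized map </dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:subject>Maps</dc:subject>
  <dc:date></dc:date>
  <dc:identifier>https://repository.example.org/item/1</dc:identifier>
</oai_dc:dc>"""

for field in crosswalk(SAMPLE):
    print(field)
```

Without shared content standards, the values arriving in `dc:creator` or `dc:date` can take almost any form, so most of the real effort lies in normalizing what goes into each field, not in the tag mapping itself.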
Each of these articles discusses a topic that is the subject of active OCLC interest and development attention. What prompted me to connect them, though, was less this than their shared discussion of where and why consistency of data is important, and of the complications for processing that arise in its absence.
Library standards-making has often seemed to emerge from social consensus-making and a type of anticipative refinement to meet a wide range of cases. As libraries exercise data in a variety of applications, as data flows more between systems, and as matching, mining and mixing become important to support services, more evidence becomes available about where structure and consistency are important and what trade-offs are involved (between upstream flexibility in data creation practices and downstream processing and use, for example). This evidence should improve the ways in which we think about standardisation and best practices.

Lorcan Dempsey dot Net

The social, cultural and technological contexts of libraries, services and networks
