Disclosure and repositories again

Lorcan 1 min read

While talking about disclosure, I have been meaning to note Google’s Sitemaps, and in particular their support for OAI-PMH.

The Sitemap Protocol allows you to inform search engines about URLs on your websites that are available for crawling. In its simplest form, a Sitemap that uses the Sitemap Protocol is an XML file that lists URLs for a site. The protocol was written to be highly scalable so it can accommodate sites of any size. It also enables webmasters to include additional information about each URL (when it was last updated; how often it changes; how important it is in relation to other URLs in the site) so that search engines can more intelligently crawl the site. [Google Webmaster Tools]
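To make the description concrete, here is a minimal sketch of generating such a file in Python. It assumes the sitemaps.org 0.9 namespace; the `build_sitemap` helper and the example.org URLs are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 schema (assumed version for this sketch).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap XML string for a list of dicts.

    Each dict needs a "loc" key and may carry the optional hints
    "lastmod", "changefreq" and "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical repository item pages.
sitemap = build_sitemap([
    {"loc": "https://example.org/item/1", "lastmod": "2007-01-15",
     "changefreq": "monthly", "priority": "0.8"},
    {"loc": "https://example.org/item/2"},
])
print(sitemap)
```

Only `loc` is required for each URL; the other three elements are the optional crawling hints mentioned in the quoted passage.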

And the use of OAI-PMH:

OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting): This is an application-independent interoperability framework based on metadata harvesting. Generally, you would use this format only if you already have a site that uses this protocol. You can’t use this format for Mobile Sitemaps. If you use this format for your site, simply add the baseURL of your OAI repository (for instance, [Webmaster Help Center – What other formats can I use for a Sitemap?]

I was reminded of this by Terry Reese’s post about ContentDM, OAI and Google, where he ‘discloses’ the contents of the repository to Google using ContentDM’s OAI harvesting feed for the sitemap.
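For readers unfamiliar with what a harvester does with a baseURL, the following is a rough sketch of how OAI-PMH requests are formed from it. The six verbs and the `oai_dc` metadata prefix come from the OAI-PMH 2.0 specification; the `oai_request_url` helper and the repository address are hypothetical, and this is not a description of how ContentDM or Google actually implement the exchange.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a repository baseURL."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Hypothetical baseURL; a real repository publishes its own.
BASE_URL = "https://repository.example.org/oai"

# Ask for all records as unqualified Dublin Core.
url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would fetch this URL, parse the XML response, and follow any resumption token to page through the rest of the repository.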