Discovery and disclosure

Science Library Pad has a couple of posts about libraries and the long tail. He makes the following interesting point contrasting ‘availability’ with ‘discoverability’:

For example, PhotoBucket is in the availability business. You get a bucket of storage, you dump your photos in. It is mostly not in the discoverability business. That’s up to the users, as they post the photos in various places on the net. I would also consider Amazon S3 and Open Access repositories to be mainly in the availability business.

Google, of course, is a classic example of a discoverability business. And I think it’s really in understanding the differences between availability and discoverability that we can learn a lot about our businesses.

Libraries are mainly about availability, as far as I’m concerned. I think one of the big conflicts has been that some libraries thought they were in the discoverability business, this is why they perceive Google to be a competitor or a threat. One of the big areas of confusion, I think, is that physical availability is about providing the container. If I can find the book in its one-and-only-one possible shelf location, then I can provide you with the service. In the online world, availability is about providing the content. This is also a business that libraries thought they were in, but again I would argue, they really weren’t. [Science Library Pad]

Now, you can make up your own mind about this argument. It highlights for me, though, a slightly different distinction, one between disclosure and discovery, and maybe one comes to a similar conclusion via a different route.
If you want something to be discovered it has to be disclosed to a discovery environment. And techniques for effective disclosure are now big business given the steps folks take to have their stuff found in the search engines. If I want people to know that I am a plumber available for hire, I do not simply put a note on my door. I disclose my availability through the yellow pages, the local newspaper, Google ads: all those places where I know that I am going to be discovered. If I am a repository, I disclose what I have available by making metadata available for harvesting under OAI or other approaches, or for crawling by the search engines.
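To make the repository case concrete, here is a minimal sketch of what a harvester sees when metadata is disclosed under OAI-PMH. The `verb` and `metadataPrefix` parameters and the `oai_dc` format come from the protocol itself; the endpoint `repository.example.org`, the helper names and the sample response are invented for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# Namespace of the Dublin Core elements carried inside oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would request to list disclosed records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# An invented, heavily trimmed response of the kind a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvested_titles(xml_text):
    """Pull the Dublin Core titles out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC + "title")]


print(list_records_url(BASE_URL))
print(harvested_titles(SAMPLE_RESPONSE))
```

The repository does nothing more than answer requests of this shape; whether anybody discovers the material depends on which discovery environments choose to harvest it.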
So, if I want the stuff in my library to be discovered by those to whom it will be useful, I have to disclose its existence in those discovery environments that people actually use. Now, yes, it is true that I can expect some of them to find their way to my door (the library catalog or website), but if people are having discovery experiences elsewhere, what should I do?
Think about the catalog. Schematically, we can see at least two broad directions as we look at disclosing the existence of library materials by mobilizing more general discovery environments:

Inside out: syndicating services and data. The library wants to project a discovery experience into other contexts. I use ‘syndication’ to cover several ways of doing this. Typically, one might syndicate services or data. In the former case a machine interface is made available which can be consumed by other applications. We are used to this model in the context of Z39.50, but additional approaches may become more common (OpenSearch, RSS feeds, web services, …). How to project library resources into campus portals, or course management systems, has heightened interest here, as has the interest in metasearch. A service might provide a search of the collection, but other services may also be interesting, providing a list of new items for example. The syndication of data is of growing interest also, as libraries discuss making catalogue data available to search engines and others, with links back to the library environment. Several libraries and library organisations are exposing data in this way. And of course, OCLC has been very active in this area with Open WorldCat, where member data is exposed to several search engines. Another variation here is where libraries participate in shared initiatives which generate gravitational pull, OhioLINK or WorldCat.org, for example.
Outside in: the leveraged discovery environment. This is a clumsy expression for a phenomenon that is increasingly important, where one leverages a discovery environment which is outside one's control to bring people back into one's own catalogue environment. Think of Amazon or Google Scholar. Now this may be done using fragile scraping or scripting environments, as for example with library lookup or our FRBR (Functional Requirements for Bibliographic Records) bookmarklets. Here, a browser tool may, for example, recognise an ISBN in a Web page and use that to search a library resource. The broader ability to deploy, capture and act on structured data may make this approach more common: the potential use of COinS (ContextObjects in Spans) is a specific example here. Basically, an application needs a hook which can connect to the local environment. How this will happen more smoothly is an intriguing question for discussion elsewhere.
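The ‘outside in’ hook can be sketched in a few lines. This is not how library lookup or the FRBR bookmarklets are actually written (those run in the browser); it is a minimal Python illustration, assuming a hypothetical catalogue at `catalogue.example.edu` that accepts an `isbn` query parameter. The COinS part follows the published convention of an empty span with class `Z3988` whose title carries an OpenURL ContextObject.

```python
import re
from html import escape
from urllib.parse import urlencode

# Hypothetical catalogue search endpoint and parameter name (assumptions).
CATALOGUE_SEARCH = "https://catalogue.example.edu/search"

# Loose pattern for ISBN-10/ISBN-13 candidates, allowing hyphens or spaces.
ISBN_CANDIDATE = re.compile(r"\b(?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx]\b")


def is_valid_isbn(isbn):
    """Check the ISBN-10 or ISBN-13 check digit of a normalised string."""
    if len(isbn) == 10 and isbn[:9].isdigit():
        values = [10 if ch == "X" else int(ch) for ch in isbn]
        return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0
    if len(isbn) == 13 and isbn.isdigit():
        weights = [1 if i % 2 == 0 else 3 for i in range(13)]
        return sum(int(ch) * w for ch, w in zip(isbn, weights)) % 10 == 0
    return False


def find_isbns(page_text):
    """Return the valid ISBNs found in a page, normalised, in page order."""
    found = []
    for match in ISBN_CANDIDATE.finditer(page_text):
        isbn = re.sub(r"[- ]", "", match.group()).upper()
        if is_valid_isbn(isbn) and isbn not in found:
            found.append(isbn)
    return found


def lookup_url(isbn):
    """Turn a recognised ISBN into a search against the local catalogue."""
    return CATALOGUE_SEARCH + "?" + urlencode({"isbn": isbn})


def coins_span(isbn):
    """Emit a COinS span that other tools can recognise and act on."""
    context_object = urlencode({
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.isbn": isbn,
    })
    return '<span class="Z3988" title="%s"></span>' % escape(context_object)


page = "Paperback, ISBN 0-306-40615-2; also 978-0-306-40615-7 and 1234567890"
for isbn in find_isbns(page):
    print(lookup_url(isbn))
    print(coins_span(isbn))
```

The scraping half (`find_isbns`) is the fragile part, since it guesses at what a page means; the COinS half shows the alternative, where the page itself discloses structured data that a local tool can act on.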

As we move forward, disclosure becomes a more important concern. This may not be the best word. But we have to do a better job of ‘disclosing’ what is ‘available’ in the ‘discovery’ environments where people look for things. Hanging a note on the door may not be good enough.

lorcan dempsey dot net