I have been talking to a variety of groups in recent weeks, and the future of the catalog has risen to the top of the list in discussion and questions. The catalog is a topic of major debate. However, this discussion is really raising a set of broader issues about discovery and about the continued evolution of library systems, including the catalog, in a changing network environment.
Several things seem to be going on. Here are some thoughts.
The discovery experience does not have to be tied to the inventory management system. In some ways we have end-to-end integrated library systems where the ends are in the wrong places. At one end, the discovery experience is embedded in a catalog interface. And, as we now realize, it is often a somewhat flat experience with low gravitational pull when compared to some other discovery environments. At the other end, the ‘fulfilment’ options open out onto only a part of the universe of materials which is available to the user: the local catalogued collection. And there is a growing gap between the catalogued collection and the available collection.
Elsewhere, I have suggested that we can think about some distinct processes – discover, locate, request, deliver – in the chain of use of library materials. Increasingly we will see these sourced as part of separate systems which may be articulated in various combinations, and across material types.
Resolution, for example, is now used to locate instances of discovered items, usually articles. In the future, resolution seems likely to develop into more of a service router: given some metadata, what services are available to me on the resource referred to by the metadata (borrow it, buy it, send it to a colleague, …), or which relate to the metadata itself (export in a particular citation format, for example). It is a way of connecting potentially multiple discovery experiences to multiple fulfilment (request/deliver) services, or multiple other services.
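The "service router" idea can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any existing resolver: the service names, the metadata keys (`isbn`, `issn`, `volume`, `title`) and the eligibility rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Service:
    """One service the router can offer (all entries below are hypothetical)."""
    name: str
    target: str  # "resource" (the thing described) or "metadata" (the description)
    applies: Callable[[Dict[str, str]], bool]


# Hypothetical rules: which metadata must be present for a service to be offered.
SERVICES: List[Service] = [
    Service("borrow", "resource", lambda m: bool(m.get("isbn"))),
    Service("buy", "resource", lambda m: bool(m.get("isbn"))),
    Service("full_text", "resource",
            lambda m: bool(m.get("issn")) and bool(m.get("volume"))),
    Service("export_citation", "metadata", lambda m: bool(m.get("title"))),
]


def route(metadata: Dict[str, str]) -> List[str]:
    """Given some metadata, return the names of the services that apply to it."""
    return [s.name for s in SERVICES if s.applies(metadata)]


if __name__ == "__main__":
    book = {"isbn": "9780306406157", "title": "An example book"}
    print(route(book))  # services on the resource, plus citation export
```

Any discovery environment that can hand over a little metadata could call something like `route`, which is the sense in which resolution connects multiple discovery experiences to multiple services.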
So, discovery of the catalogued collection will be increasingly disembedded, or lifted out, from the ILS, and re-embedded in a variety of other contexts. And potentially changed in the process. And, of course, those contexts themselves are evolving in a network environment.
What are some of those other discovery contexts? Here are some current examples:
- Local catalog discovery environments. There has been a recent emphasis on the creation of an external catalog discovery system, which takes ILS data and makes it work harder in a richer user interface. The NCSU catalog has been much discussed and admired in this context. Ex Libris has announced its Primo product, which will import data from locally managed collections and re-present it. And we have just seen announcements about the eXtensible Catalog project at the University of Rochester.
- Shared catalog discovery environments. We also observe a greater trend to shared catalogs, often associated with resource sharing arrangements. It has not been unusual to see a tiered offering, with resources at progressively broader levels (for example: local catalog, regional/consortial, WorldCat). The level of integration between these has been small. However, in recent times we have seen growing interest in moving more strongly to the shared level. This may be to strengthen resource sharing arrangements, to better match supply and demand of materials (the ‘long tail’ discussion), or to save resources. And once one moves in this direction, the question of scoping the collective resource in different ways emerges: moving from local to some larger grouping or back.
- Syndicated catalog discovery environments. Increasingly, the library wants to project a discovery experience into other contexts. I use ‘syndication’ to cover several ways of doing this. Typically, one might syndicate a service or data. In the former case a machine interface is made available which can be consumed by other applications. We are used to this model in the context of Z39.50, but additional approaches may become more common (OpenSearch, RSS feeds, …). The question of how to project library resources into campus portals or course management systems has heightened interest here. The syndication of data is also becoming of more interest, as libraries discuss making catalog data available to search engines and others. And OCLC has been very active in this area with Open WorldCat.
- The leveraged discovery environment. This is a clumsy expression for a phenomenon that is increasingly important, where one leverages a discovery environment which is outside one's control to bring people back into one's own catalog environment. Think of Amazon or Google Scholar. Now this may be done using fragile scraping or scripting environments, as for example with library lookup or our FRBR bookmarklets. Here, a browser tool may, for example, recognize an ISBN in a web page and use that to search a library resource. The broader ability to deploy, capture and act on structured data may make this approach more common: the potential use of COinS is a specific example here.
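To make the leveraged case concrete, here is a minimal sketch of what such a tool does: find ISBN-like strings in page text, keep those whose check digit is valid, and turn each into a catalog search link. The check-digit rules are the standard ISBN-10 and ISBN-13 ones; the catalog base URL and the `isbn` query parameter are assumptions for illustration, not the interface of any particular catalog.

```python
import re
from typing import List
from urllib.parse import urlencode

# Candidate pattern: optional 978/979 prefix, nine digits, then a digit or X,
# allowing hyphens or spaces between characters.
ISBN_CANDIDATE = re.compile(r"\b(?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx]\b")


def is_valid_isbn(raw: str) -> bool:
    """Check the ISBN-10 (mod 11) or ISBN-13 (mod 10) check digit."""
    s = re.sub(r"[- ]", "", raw).upper()
    if len(s) == 10 and re.fullmatch(r"\d{9}[\dX]", s):
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(s))
        return total % 11 == 0
    if len(s) == 13 and s.isdigit():
        total = sum(int(c) * (1 if i % 2 == 0 else 3)
                    for i, c in enumerate(s))
        return total % 10 == 0
    return False


def find_isbns(text: str) -> List[str]:
    """Return valid ISBNs found in text, normalized (no hyphens or spaces)."""
    found = []
    for match in ISBN_CANDIDATE.finditer(text):
        if is_valid_isbn(match.group()):
            found.append(re.sub(r"[- ]", "", match.group()).upper())
    return found


def catalog_search_url(isbn: str,
                       base: str = "https://catalog.example.edu/search") -> str:
    """Build a search link; the base URL and parameter name are hypothetical."""
    return base + "?" + urlencode({"isbn": isbn})


if __name__ == "__main__":
    page = "Paperback, ISBN 0-306-40615-2. Also listed as 978-0-306-40615-7."
    for isbn in find_isbns(page):
        print(catalog_search_url(isbn))
```

The fragility mentioned above shows up here: the tool depends on pattern matching against whatever text a page happens to contain, which is why structured data embedded in the page is the more robust route.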
Here are some questions which arise whatever the discovery context.
- The user experience – ranking, relating and recommending. There is a general recognition that discovery environments need to do more to help the user. Developers are looking at ranking (using well-known retrieval techniques with the bibliographic data, or probably more importantly, using holdings, usage or other data which gives an indication of popularity), relating (bring together materials which are in the same work, about the same thing, or related in other ways), and recommending (making suggestions based on various inputs – reviews or circulation data for example). Users of Amazon and other consumer sites are becoming used to a ‘rich texture of suggestion’, and we have data to do a better job here. And this leads naturally into the mobilization of user contribution – tagging, reviews – something that may best happen at a shared level.
- The backend – an ILS service layer. If discovery is separated from the ILS, there needs to be a way for the two to communicate. Again, this is currently done through a variety of proprietary scripting and linking approaches. It would be useful to agree a set of appropriate functionality and some agreed ways of implementing it.
- The discovery deficit – the catalogued collection is a part only of the available collection. I am thinking of two related things here. The first is that there will be a growing desire to hide boundaries between databases (A&I, catalog, repositories, etc.) in some cases – especially where those boundaries are seen more to reflect the historical contingencies of library organization or the business decisions of suppliers than the actual discovery needs of users. We will see greater integration of the catalog with these other resources, whether this happens at the applications level (where the catalog sits behind the resolver, or is a metasearch target), or at the data level (where catalog data, article level data, repository data, and so on, are consolidated in merged resources). This then poses the second issue, which is about the data itself. Our catalogs are created in a MARC/AACR world, with established practices for controlling names, subjects and so on. However, as the catalog plays in a wider resource space, issues arise in meshing this data with data created in different regimes, and accordingly in leveraging the investment in controlled data. Think about personal names for example, where authority control practices apply only to the ‘catalogued collection’. What does it mean when that data is mixed with other data?
- Routing. As we separate functions – discovery from location and fulfilment – we need good ways of tying them back together. This was addressed above, when talking about resolution. In the longer term, it is also an example of the broad interest converging on directories and registries. In the type of environment I have sketched here, we need registries which manage the ‘intelligence’ that applications need to tie things together. Registries of services (resolvers, deep OPAC links, Z39.50/SRW/U targets, …), institutions (complex things ;-), and so on. One wants to be able to tie IP addresses to services (so that you know which services to present to a user), or institutional service points to geographic coordinates (so as to be able to place locations on a map), and so on.
- Sourcing. This is an interesting area which is not yet widely explored in the ILS area. The typical current model is a licensed software model where an instance of a vendor application is run locally. The examples above show some other models: local development, collaborative sourcing, and an on-demand model where the catalog is provided as a network service. Here as in other areas of library systems work, we are likely to see a much more plural approach to sourcing system requirements in coming years.
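The registry idea in the routing item above, tying IP addresses to services and service points to coordinates, can be illustrated with a small lookup. This is a sketch only: the institutions, network ranges (taken from the reserved documentation ranges), resolver URLs and coordinates are all made up, and a real registry would be a shared network service instead of an in-memory list.

```python
import ipaddress
from typing import Dict, List, Optional

# Hypothetical registry entries: every value here is invented for illustration.
REGISTRY: List[Dict[str, object]] = [
    {
        "institution": "Example University",
        "network": "192.0.2.0/24",
        "resolver": "https://resolver.example.edu/openurl",
        "location": (40.0, -83.0),  # (latitude, longitude) of a service point
    },
    {
        "institution": "Example Public Library",
        "network": "198.51.100.0/24",
        "resolver": "https://resolver.example.org/openurl",
        "location": (53.3, -6.3),
    },
]


def lookup(ip: str) -> Optional[Dict[str, object]]:
    """Return the registry entry whose network contains this IP, if any."""
    addr = ipaddress.ip_address(ip)
    for entry in REGISTRY:
        if addr in ipaddress.ip_network(str(entry["network"])):
            return entry
    return None


if __name__ == "__main__":
    entry = lookup("192.0.2.17")
    if entry is not None:
        # Knowing the institution tells an application which resolver to
        # present, and where to place the service point on a map.
        print(entry["institution"], entry["resolver"], entry["location"])
```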
The catalog discussion is often presented as just that, the catalog discussion. However, it belongs in a wider context. We may be lifting out the catalog discovery experience, but we are then re-embedding it in potentially multiple discovery contexts, and those discovery contexts are being changed as we re-architect systems in the network environment. These systems include discovery systems for other collection types (the institutional repository, or digital asset repository, or …); the emergence of a general search/resolution layer within the library; external environments as different as Google and Amazon, the RSS aggregator, or the course management system. It also includes a variety of supply chains: resource sharing, e-commerce, local.
The catalog question is a part of how we re-architect the discovery-to-delivery apparatus for the available collection.
- Search, share and subscribe
- Thinking about the catalog
- Discover, locate .. vertical and horizontal integration
- Systemwide activities and the long tail
- Systemwide discovery and delivery
- A palindromic service layer
- Making data work – catalogs and Web 2.0
(Lifting out, disembedding, re-embedding: I borrow language from Anthony Giddens who uses it in a somewhat loftier context.)