In December 2005 I wrote:
Now, one potential advantage of the book mass digitisation initiatives currently underway is that they are potentially creating a ‘book content index’ in the way that the search engines currently have a ‘web content index’. Amazon is opening up a business which makes that ‘web content index’ available to other applications through its APIs. Which leads to an interesting question: Will Amazon open up its ‘search inside the book’ indexes in this way also (or can it)? Or will another player – Google for example – develop such a service? Or … Does anybody yet have a critical mass, or will they soon?
Such a service would be very useful, and if offered in an appropriate way could be integrated into library catalogs or other library services. Indeed, libraries could build vertical applications on top of such a service.
It seems that within a few years we will have a book content index. One of the questions for the library community will be how to use it. Another will be how to make sure that parts of the scholarly or cultural record that are not attractive to current mass digitization initiatives are not rendered less accessible over time because they are not being indexed in this way. [Lorcan Dempsey’s weblog: On demand book search]
I was thinking of this entry when writing about the latest Google announcement last week. Although much more indexed book text is now online, we don’t (yet?) have the kinds of interfaces that would let other services access that text. It will be a pity if books continue to be less accessible than the web.
Aside: I was prompted to write this entry because the original post popped up in my logs as one of the most-viewed entries so far in June. The topic must be in the air ….