Lorcan 1 min read

My colleague Thom Hickey has a short article in the current OCLC Newsletter expanding on our experiments with the Beowulf Cluster to speed up processing.

We obtained the machine to investigate parallel text searching. At OCLC we have always searched our databases in parallel, but in as few pieces as we could. In this project we took the opposite approach: to break our database into as many pieces as we could, search each at the same time, and then deal with the coordination needed to return a single result to a searcher. We are finding this works very well for searching, but, more generally, we have found it to be useful for virtually any work with large numbers of bibliographic records. WorldCat now contains well over 55 million records, even accounting for records that have been deleted and merged over the years. Since our cluster has 24 separate nodes with a total of 48 processors, we typically get 30-fold speedups in processing, and occasionally much more than that because the entire database can be cached in main memory. [research [OCLC]]
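To make the idea concrete, here is a minimal sketch of the split, search, and merge pattern Thom describes. It is not OCLC's code: the function names (`partition`, `search_shard`, `parallel_search`), the term-count scoring, and the use of a local thread pool in place of cluster nodes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import nlargest


def partition(records, n_shards):
    """Split the record list round-robin into n_shards pieces (hypothetical helper)."""
    return [records[i::n_shards] for i in range(n_shards)]


def search_shard(shard, term):
    """Search one piece: score each record by how often the term occurs in it."""
    term = term.lower()
    hits = []
    for rec_id, text in shard:
        score = text.lower().split().count(term)
        if score:
            hits.append((score, rec_id))
    return hits


def parallel_search(records, term, n_shards=4, top_k=10):
    """Search every shard at the same time, then merge into a single ranked result."""
    shards = partition(records, n_shards)
    # Assumption: a thread pool stands in for the cluster's separate nodes.
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = pool.map(search_shard, shards, [term] * n_shards)
        merged = [hit for part in partials for hit in part]
    # Coordination step: combine the per-shard hits and keep the best top_k.
    return [rec_id for _score, rec_id in nlargest(top_k, merged)]


records = [
    (1, "Beowulf a poem"),
    (2, "Beowulf and Beowulf again"),
    (3, "Cataloging rules"),
]
result = parallel_search(records, "beowulf", n_shards=2)
```

The merge step is the "coordination" in the quote: each shard knows only its own hits, so something has to combine them before a searcher sees one list. On a real cluster the shards would live on different machines, and keeping each one small enough to sit in main memory is presumably where the occasional larger speedups come from.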

I mentioned this a while ago in an entry about how developments in hardware were interesting again.
