Systems

Hwaet!

Lorcan 1 min read

My colleague Thom Hickey has a short article in the current OCLC Newsletter expanding on our experiments with the Beowulf Cluster to speed up processing.

We obtained the machine to investigate parallel text searching. At OCLC we have always searched our databases in parallel, but in as few pieces as we could. In this project we took the opposite approach–to break our database into as many pieces as we could, search each at the same time, and then deal with the coordination needed to return a single result to a searcher. We are finding this works very well for searching, but, more generally, we have found it to be useful for virtually any work with large numbers of bibliographic records. WorldCat now contains well over 55 million records, even accounting for records that have been deleted and merged over the years. Since our cluster has 24 separate nodes with a total of 48 processors, we typically get 30-fold speedups in processing, and occasionally much more than that because the entire database can be cached in main memory. [research [OCLC]]
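The approach described in that passage (split the database into many pieces, search each piece at the same time, then coordinate a single result) can be sketched roughly as follows. This is only an illustration, not OCLC's implementation: the record layout, the function names `partition`, `search_shard` and `parallel_search`, and the substring match on titles are all assumptions made for the example. It also uses threads on one machine to keep the sketch short, whereas a Beowulf cluster distributes the pieces across separate nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, pieces):
    """Split the records into `pieces` shards, dealing them out round-robin."""
    return [records[i::pieces] for i in range(pieces)]


def search_shard(shard, term):
    """Search one shard: a case-insensitive substring match on the title (hypothetical field)."""
    term = term.lower()
    return [r for r in shard if term in r["title"].lower()]


def parallel_search(records, term, pieces=4):
    """Search every shard concurrently, then merge the partial hits into one result."""
    shards = partition(records, pieces)
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # One search task per shard; collect the results before the pool closes.
        partial = list(pool.map(search_shard, shards, [term] * pieces))
    # Coordination step: flatten the per-shard hits and return them in a stable order.
    merged = [hit for hits in partial for hit in hits]
    return sorted(merged, key=lambda r: r["id"])


if __name__ == "__main__":
    # Made-up sample records, purely for illustration.
    sample = [
        {"id": 1, "title": "Beowulf"},
        {"id": 2, "title": "Grendel"},
        {"id": 3, "title": "Beowulf and the critics"},
    ]
    for hit in parallel_search(sample, "beowulf", pieces=2):
        print(hit["id"], hit["title"])
```

The merge step is where the coordination cost mentioned in the quotation shows up: the more pieces there are, the more partial results have to be combined before anything is returned to the searcher.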

I mentioned this a while ago in an entry about how developments in hardware were interesting again.


