Science, data and the record

Lorcan 2 min read

Microsoft Research in Cambridge, UK, hosted a very interesting event on the future of science: Science 2020.

The resulting document, Towards 2020 Science, sets out the challenges and opportunities arising from the increasing synthesis of computing and the sciences. It seeks to identify the requirements necessary to accelerate scientific advances – particularly those driven by computational sciences and the ‘new kinds’ of science the synthesis of computing and the sciences is creating. Already this synthesis has led to new fields and advances spanning genomics and proteomics, earth sciences and climatology, nanomaterials, chemistry and physics. [2020 Science]

The website has a range of materials, and a parallel issue of Nature has also appeared on the topic.

From the report:

4 We highlight that an immediate and important challenge is that of end-to-end scientific data management, from data acquisition and data integration, to data treatment, provenance, and persistence. …

5 Our findings have significant implications for scientific publishing, where we believe that even near-term developments in the computing infrastructure for science which links data, knowledge and scientists will lead to a transformation of the scientific communication paradigm. [Summary. Towards 2020 science. p.8 pdf]

There is also a ‘reader’s guide’ which introduces the main features of the report. It is available as a Word file. Interestingly, it raises a question about the ‘scholarly record’ that I have asked in recent talks.

With an increased reliance on highly distributed and highly derived data, there is a largely unsolved problem of preserving the scientific record. There are frequent complaints that by placing data on the web (rather than conventional publications or a centralised database), essential information is lost. How do we record how a dataset was derived? How do we preserve the history of a dataset that changes all the time? How do we find the origin of data that has been repeatedly copied between data sources? Such issues have to be resolved to offer a convincing infrastructure for scientific data management. [Towards 2020 Science — A Reader’s Guide, March 2006 – word file]

As the scientific record resides not only in the published results but also in the data and applications, questions about authenticity, provenance and context come to the fore, as do questions about the integrity of citation.

Lorcan Dempsey dot net

Deep dives and quick takes: libraries, society, culture and technology
