4 million digital specimens and counting | Digital Collections Programme

This image of Carl Linnaeus has been created from Museum specimens rather than pixels.

The Museum’s Data Portal has passed 4 million specimens, representing around 5% of the Museum’s entire collection.

The Data Portal was launched in December 2014. In addition to Museum specimens, it hosts 5.3 million other research records and over 100 datasets from internal and external authors. The Portal is a platform for researchers to make their research and collections datasets available online for anyone to explore, download and re-use.

Since 2015, more than 12 billion records have been downloaded in more than 140,000 download events. More than 140 research papers have cited data from the Portal and partner platforms such as the Global Biodiversity Information Facility (GBIF).

Our current digital collection contains specimen records and associated data, including 3D scans, audiovisual resources and GIS:

Resource count by file type

Record count by department

“For every scientist that comes to South Kensington to physically visit the Museum’s collections, 10 visit our digital collections, and this proportion is growing each year. The Data Portal has become the largest single gateway to Museum specimens, and use of this freely accessible data is creating opportunities for research and collaboration that would have been unthinkable just three years ago.” V. Smith, Head of Diversity & Informatics

The four millionth specimen to be released onto the Data Portal was a moth that was part of our project to digitise British and Irish Lepidoptera. The butterfly data that came out of this project was used by a team led by Museum Scientist Steve Brooks. Brooks compared butterfly specimens from the last 140 years to temperature records from the Met Office and found that 90% of butterflies emerge earlier in warmer springs or summers, suggesting recent climate change is having a major impact on these species. The release of this new dataset, including both macro- and micro-moths, allows scientists to examine how Lepidoptera, their food plants and climate are coupled together at a fine scale across the UK and Ireland.

“Looking at how species respond to temperature is essential for understanding the ecological and evolutionary consequences of climate change. Our study would not be possible without data from digitized collections paired with monthly temperature records.” Dr Steve Brooks, 2017

Museum Collections transforming our future

Museum collections are transforming our understanding of the past to enable more accurate prediction of the future. The Museum’s collection dates back more than 250 years and is a unique source of baseline data on historical distributions of flora and fauna. One example of a project that makes use of this information is PREDICTS, a major collaborative study led by Museum scientists. PREDICTS tracks the human impact of land use change, helping policy makers to minimise the impact of their decisions on indigenous flora and fauna.

“If society takes concerted action, and reduces climate change by valuing forests properly, then by the end of the century we can undo the last 50 years of damage to biodiversity on land.” Dr Andy Purvis, Lead Scientist for PREDICTS

The Museum Data Portal is also instrumental in our contribution to the DiSSCo initiative, an international project to create a common digital gateway to Natural History Collections and coordinate our research and digitisation activities. DiSSCo brings together 114 museums and herbaria from 21 countries and is a proposal to the European Strategic Forum on Research Infrastructures (ESFRI), which works to improve the use and development of large international research infrastructures.

More collections, More use?

iCollections canvas
British and Irish moths and butterflies from the collection

The Data Portal was created to fulfil the Museum’s commitment to open access and open science, and make its research and collections datasets available online. Data is made available as machine readable linked open data. Our collections dataset also has links to other biodiversity data repositories, including Catalogue of Life and World Register of Marine Species. Interlinking our specimens with other data allows for exciting new possibilities such as big data analysis of species occurrences across the globe and makes our portal one of the few five star linked open data portals in the world.
Anyone in the world can explore, download and reuse the data for their own research, either through the Web interface or the API. The portal, like the Museum collection, is organised taxonomically, but has the potential to be digitally reorganised to suit any research requirement. Through upcoming developments, the Portal development team plan to make this even easier by allowing users to browse the collection through any combination of geospatial, temporal and phylogenetic queries. The team also plan to integrate data from a collections assessment exercise, allowing the user to better understand what proportion of our collections are digitised and make recommendations on what should be digitised in the future.
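
As a rough illustration of what reusing the data through the API might look like, here is a minimal Python sketch. It assumes a CKAN-style `datastore_search` endpoint that returns JSON; the base URL, the resource ID and the `scientificName` field are hypothetical placeholders rather than details taken from this post, so check the Portal's own API documentation before relying on any of them. The example runs offline against a hand-written response of the assumed shape.

```python
import json
from urllib.parse import urlencode

# Hypothetical values for illustration only; not taken from the Portal itself.
BASE_URL = "https://data.example.org/api/3/action/datastore_search"
RESOURCE_ID = "specimens-resource-id"


def build_search_url(query, limit=10, base_url=BASE_URL, resource_id=RESOURCE_ID):
    """Return a GET URL for a free-text search over one dataset resource."""
    params = {"resource_id": resource_id, "q": query, "limit": limit}
    return f"{base_url}?{urlencode(params)}"


def extract_records(response_text):
    """Pull the list of records out of a JSON response of the assumed shape."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("API reported failure")
    return payload["result"]["records"]


# Offline demonstration: a hand-written response standing in for a real one.
sample = json.dumps(
    {"success": True, "result": {"records": [{"scientificName": "Toxodon platensis"}]}}
)
url = build_search_url("Toxodon")
records = extract_records(sample)
print(url)
print(records[0]["scientificName"])
```

In real use, the URL would be fetched with an HTTP client and the response body passed to `extract_records`; keeping the two steps separate makes the parsing easy to check without a network connection.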

Data as Art

The image of Carl Linnaeus is just one example of how Museum data can be used to engage a wider audience beyond scientific researchers. This image has been produced using 7,600 images from a pool of images from the portal. Analysing the data used for the image shows us that the largest proportion of the specimens in the image were named by the ‘father of modern taxonomy’, Carl Linnaeus himself.

Chart showing which taxonomists named the specimens used in the Carl Linnaeus collage.

We aim to encourage the widest possible use of our collection, both digitally and physically. A recent survey of 150 Portal users suggests this is being borne out by their experiences with the collections data, and it is reflected in comments on social media, which include artists using the database to study natural subjects, students using the data to practise machine learning, and educators searching for inspiration and images to use for teaching purposes.

What Next?

This is an exciting period for the Data Portal as we look to add new features and functionality through the recently published software development roadmap. For example, in March 2018, Sketchfab was integrated with the Portal allowing users to view 3D models and visualisations of our specimens right next to the specimen data and images from the collections. This was used to great effect as part of the Darwin’s Fossil Mammals book launch this year.

Skull of Toxodon platensis on Sketchfab

“The Toxodon skull is brilliantly demonstrated online. Wonderful. Darwin would be dancing with glee to see it! Certainly for the amateur the resolution is already quite adequate. From the (very good) photo on page 76, I had had some difficulty envisioning the relationship between the cheek bone and the position of the eye and brow. The 3D model clearly shows how far the cheek bone protrudes laterally, presumably to accommodate large jaw muscles. It’s almost as good as holding it in your hand. Do let me know when more models appear.” Andrew Ferguson, Rothamsted Experimental Station

In the future we’re planning to implement more advanced search features, integrating with other services and redesigning parts of the site to improve the user experience.
Help us to celebrate this milestone by checking out our 4 million specimens on the Data Portal. If you are using Museum collection data, we would love to hear about it. Follow us on Twitter to stay up to date with the programme and to tell us how you are using our specimens.