Data in action: Time travelling with butterfly specimens | Digital Collections Programme

A guest blog by Galina Jönsson

From left: Galina Jönsson in the Museum collection, the Museum’s Data Portal, and a graph showing scientific paper rates and numbers since 2015.

Digital Collections support over 1000 scientific papers

The Museum’s Data Portal was launched in December 2014 to provide access to Museum collections and research, enabling people to explore, download and re-use these data for their own purposes. Museum collections include specimens collected over the last 200 years, a critical time period during which humans have had a major impact on the distribution of biodiversity.

Since 2015, more than 1000 research papers have cited data from the Data Portal and partner platforms like the Global Biodiversity Information Facility (GBIF), covering topics including agriculture, biodiversity, evolution, ecology, species distributions and human health. This blog looks at just one of the studies using Museum data: PhD candidate Galina Jönsson’s research, which uses the data to examine how human activity has impacted butterfly populations over the 20th Century.

Insects are declining at alarming rates, but we do not precisely know why. From wasps to butterflies, Galina is looking for answers in the Museum’s pinned insect collection and extending time series to span the period of accelerating human pressures like agricultural intensification and deforestation. ‘At first glance, my results suggested that British insects fared pretty well, but I quickly realised there is much more to this than meets the eye.’

Blame eccentric Victorians or lazy statisticians?

A graph showing rates of butterfly distribution change over the last century

Natural history collections’ pinned insect specimens have revealed fascinating changes over recent centuries but have rarely been used to map how, and why, some species increase while others decrease. Nearly everything we know about insect responses to human activities comes from survey data collected by national schemes like the UK Butterfly Monitoring Scheme (UKBMS), which was launched in 1976. One of the benefits of using survey data is that it is standardised, meaning that all species at a particular location are recorded in the same way, at the same time of the year, for multiple years in a row. This makes it easy to compare how different species change in population or geographical location over the years. In the UK, which is unusually well-documented, our knowledge from such survey data is limited to the period since 1970. This period falls after most large-scale transformations of the British landscape, such as the agricultural intensification of the 1950s with its deforestation and increased pesticide use. As a result, we find ourselves without baselines reflecting the state of biodiversity prior to major human pressures.

In contrast to survey data, museum specimens do go back much further in time to give us these baselines – but they were not systematically sampled. This challenges conventional statistics. Labels inform us where and when specimens were collected, but not how. Just like millennial houseplant enthusiasts, Victorian bug collectors had individual preferences. Some travelled far to collect one specimen of every species; others collected every tiny variation within their favourite species. Some collectors were working scientists, but a lot of the collection comes from amateurs and those who collected as a hobby, so the type of specimens and the data that were recorded also vary by collector.

Amateur collectors can match or exceed the standards of scientists who were paid for their work. Robert ‘Porker’ Watson (1916-84) was a tax accountant and ‘amateur’ butterfly collector and breeder whose setting of specimens was described as ‘a miracle of perfection’ (Aurelian Legacy, British Butterflies and their collectors; Salmon, M. A. et al., 2000, p. 230).

Ambitious digitisation projects are making collections available with the click of a button, and citizen science projects now generate enormous amounts of contemporary data in addition to data from collections and systematic surveys. Smartphone applications let anyone submit wildlife sightings in seconds but, just as collections reflect eclectic Victorians, citizen scientists’ preferences introduce their own set of biases to the data. We need new statistical models to extract the valuable yet varied information that museum specimens, survey data and citizen science sightings hold, but the models also need to handle their respective biases.

The European hornet (Vespa crabro) and its distribution in the UK over the last 120 years.

Solving the statistical riddle


As a master’s student, I naturally felt drawn to solving the statistical riddle and embarked on modelling social wasp trends using the Museum’s collection alongside survey data from the Bees, Wasps and Ants Recording Society. Our study extended existing trends by 70 years, and indicated that agricultural intensification drove a 70% decline in English hornets (Vespa crabro) between 1950 and 1970. This was followed by a northward range expansion facilitated by climate change-induced warming. Today, hornets have bounced back to 1950 levels in terms of numbers but are more sparsely distributed over a larger area. Through this study, we demonstrated that specimen data from collections can produce long-term population trends, but thoroughly addressing questions of human influence requires more museum data, in terms of both species and specimens per species.

And in flew century-old butterfly specimens, forming the basis of my PhD research. In the interest of honesty, perhaps I should say ‘in flew iCollections’, NHM’s pilot mass digitisation project that digitised over half a million British butterflies and moths.

My current research explores temporal patterns of British butterfly trends across centuries, looking at how the timings of major changes to butterflies coincide with habitat changes, and how species-specific characteristics affect population-level change. There are 59 British butterfly species; another five species have become extinct in the last 150 years. Butterflies are sensitive to temperature and weather conditions, and caterpillars are picky eaters; some accept nothing but one specific host plant. These factors render them particularly vulnerable to, and simultaneously good indicators of, greater habitat and climate changes.

Generalisations hide uncomfortable truths

After a couple of years formulating the perfect model (hint: there is no such thing as a ‘perfect model’), I summarised the trends across all British butterfly species. The preliminary results were surprising. Averaging across species, there has been a 15% decrease since 1900. But we know that humans have extensively altered 75% of Earth’s surface, so this had me wondering – is a 15% decrease over 120 years really that bad?

Next, I grouped species according to whether they are specialists requiring specific habitats (the picky eaters) or generalist wider countryside species that can use a range of habitats. The generalist species have nearly doubled since 1900, whilst the specialists have halved. Separating specialists from generalists also showed that the most dramatic changes occurred before the 1970s baseline that many recording schemes give us. Just like the hornets, specialist butterflies started to plummet around 1950, but in contrast to hornets, they did not recover after 1970. It appears that agricultural intensification in the 1950s triggered the troubling subsequent declines (or at least was the straw that broke the specialist’s thorax). Wider countryside species also began expanding in the 1950s, and this expansion continued into the 2000s.
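
As a rough illustration of how an all-species average can hide diverging groups, here is a minimal Python sketch. The index values are hypothetical (abundance relative to 1900, where 1.0 means no change) and were chosen only to mirror the pattern described above. They are not the study's data, and the simple arithmetic mean used here is an assumption rather than the model used in the research.

```python
from statistics import mean

# Hypothetical abundance indices relative to 1900 (1.0 = no change).
# These values are invented for illustration, not taken from the study.
species_index = {
    "generalist_a": 2.0,
    "generalist_b": 1.8,
    "specialist_a": 0.4,
    "specialist_b": 0.5,
    "specialist_c": 0.6,
    "specialist_d": 0.5,
    "specialist_e": 0.45,
    "specialist_f": 0.55,
}


def group_of(name: str) -> str:
    """Return the habitat group encoded in the hypothetical species name."""
    return name.split("_")[0]


def mean_index_by_group(indices: dict[str, float]) -> dict[str, float]:
    """Average the index separately within each habitat group."""
    groups: dict[str, list[float]] = {}
    for name, value in indices.items():
        groups.setdefault(group_of(name), []).append(value)
    return {group: mean(values) for group, values in groups.items()}


# One number for all species, then one number per habitat group.
overall = mean(list(species_index.values()))
by_group = mean_index_by_group(species_index)

print(f"all species: {overall:.2f}")
for group, value in sorted(by_group.items()):
    print(f"{group}: {value:.2f}")
```

With these invented values the overall mean is about 0.85 (a 15% decrease), while the generalists average about 1.9 and the specialists about 0.5, so the single headline figure conceals two very different trends.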

What is wrong with generalists?


Overall, preliminary results show that we’ve lost around 15% of British butterflies since 1900. Specialised species have plummeted, but generalist wider countryside species are making up for the losses. Sometimes people ask ‘what is wrong with generalists?’ – does it really matter which butterfly species are in the ascendant? It all comes down to biodiversity: the diversity of life on Earth, which we need for human well-being, prosperity and, ultimately, survival.


Species richness is the number of different species in an area, a way of measuring biodiversity. When the number of species thriving in an area declines or becomes unbalanced, certain species that are doing well can come to increasingly dominate the area. The species that can’t adapt are put under further pressure from the increasing generalist species eating their food or nesting in their areas. A change to the delicate balance of the species in an area can reduce biodiversity and species richness, cause extinctions and dramatically change ecosystems.
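
The two ideas in this paragraph, richness and dominance, can be sketched in a few lines of Python. The species names and counts below are hypothetical, and measuring dominance as the share of records belonging to the most common species is just one simple choice made for this example.

```python
from collections import Counter


def species_richness(records: list[str]) -> int:
    """Number of distinct species among the records."""
    return len(set(records))


def dominance_share(records: list[str]) -> float:
    """Share of all records belonging to the most common species.

    Assumes the list of records is not empty.
    """
    counts = Counter(records)
    return max(counts.values()) / len(records)


# Hypothetical records for one area at two points in time.
before = ["A"] * 5 + ["B"] * 5 + ["C"] * 5 + ["D"] * 5
after = ["A"] * 17 + ["B"] * 2 + ["C"] * 1

for label, records in (("before", before), ("after", after)):
    print(label, species_richness(records), round(dominance_share(records), 2))
```

In this made-up example the total number of records stays at 20, but richness falls from four species to three and one species grows from a quarter of the records to 85% of them.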

The wall butterfly (Lasiommata megera) distribution change over 20th century


However, is it fair to divide all butterflies into either habitat generalists or specialists, and to assume that, within each group, every species shows the same long-term trends? Although a habitat-use separation can give useful indications, the reality is much more complex. For instance, the wall butterfly (Lasiommata megera) has suffered worrying declines despite enjoying a variety of habitats. With rising temperatures, the cold-loving wall butterfly has been forced northwards and risks joining the list of butterflies that are extinct in Britain when it reaches John O’Groats. Biologists often divide species by habitat-use, but the dramatic decline of the wall butterfly shows us that every species has its own particular quirks, extending beyond habitat-use. In addition to temperature-tolerance, species differ in a number of characteristics like their reproduction strategies (for example, many tiny eggs of which few survive, or a few huge eggs with high survival), the ease with which they find a mate, and how strong they are as flyers (which determines if they can colonise new habitats). I am currently using several such species-specific characteristics to identify combinations of characteristics that predispose some species to being particularly vulnerable and give others the ability to rapidly expand.

Natural history collections’ specimens are vital to gather the data needed to extend time series of species’ trends to periods prior to extensive anthropogenic pressures and provide important novel insights into our effects on biodiversity. However, most specimens world-wide are relatively inaccessible to research, hidden away in undigitised collections. Mobilising digitisation projects that provide open access to this important biodiversity data will allow us to refine models, produce more accurate future projections, and make effective conservation decisions to bend the curve of global biodiversity loss.

We would love to hear from you if you are using data from data.nhm.ac.uk: please get in touch. You can also stay up to date with Digital Collections news by following us on Twitter and Instagram, and keep up with our blog posts for more examples of our data in action.

If you are spotting butterflies this summer, please log your findings with a recording scheme so that researchers like Galina can make use of your work. You can also follow Galina on Twitter to keep up with her research.

What do the Common swift, Cockchafer and Caddisfly all have in common? | Digital Collections Programme

A guest blog by Nicola Lowndes

Adults of these species are attracted to the light of a moth trap, of course! In this instance I am not referring to the Common Swift bird (Apus apus), which is seen carrying out impressive aerial displays in summer, but to the beautiful Common Swift moth (Korscheltellus lupulina).


Digitisation uncovers rare specimens that highlight the diversity of sex in nature | Digital Collections Programme

Digitisation enables us to understand exactly what we have in the collection. This can provide updated and accurate collection records, improve estimates for digitising future collections and occasionally uncover the unexpected.

Digitising Butterfly types of the 21st century | Digital Collections Programme


A Guest blog by Robyn Crowther and Blanca Huertas

Some of the Museum’s invaluable butterfly reference material, previously only accessible to a handful of scientists, has been released onto the Museum’s Data Portal. Over 90% of these specimens were designated as types in the 21st Century, but this is the first time that images of many of these species have been freely accessible to the global community.

My type on paper

When scientists describe and name a new species, they aren’t actually describing every individual that belongs to that species. Instead they select one or a few specimens with ‘typical’ characteristics to represent the species and base a detailed description on them. These name-bearing specimens are known as types, and are used as a reference when identifying and grouping other individuals into that species.

Each butterfly and its labels are imaged as part of the digitisation process.

A type bears not only a name, but a big responsibility. If you want to identify and name specimens you have observed or collected you need to look to the type (or an illustration of it) and compare the key characteristics that make that species unique and different from others. For this reason, types are arguably some of the most important specimens in a collection and a priority for digitisation projects.

Recently, the Museum’s butterfly types have been moved from the main collection into a new, separate collection, making it easier to find, use and reference them. To make these types even more accessible, it was also decided that this collection would be digitised and made available on data.nhm.ac.uk. Curating the types separately first makes digitisation much more efficient, removing the need to ‘pick and choose’ from many different collection drawers.

Vital statistics

We digitised 1000 specimens, covering 220 species. These specimens were collected from 46 countries, representing all continents. The oldest type in this project was designated in 1939 and the newest in 2017.

What’s in a name?

Digitisation isn’t just about capturing an image of a specimen. Before these butterflies were ready for their close ups, extensive curatorial work was needed to prepare the collection, ensuring that each specimen is associated with the correct taxonomic information (e.g. the species and genus names are correct).

The traditional Museum round label with a red border makes specimens instantly recognisable as Holotypes

Among these specimens, we found various examples that illustrated the importance of this digitisation project. For example, six specimens used to describe the species Cacyreus niebuhri, an African species, in 1982, had no identification labels or registration information when they were found in the mixed collections – they had lost their name!

As part of this project, an investigation was mounted to discover the true identity of these six butterfly types. Fortunately, information about when and where the specimens were collected was available on the labels pinned underneath each butterfly, with a small label from the author stating they were part of a type series.

The specimen labels indicated that they were collected in the Republic of Yemen by “T.B. Larsen” in 1980. A former Scientific Associate of the Museum, Dr Torben Larsen was a world-renowned expert on butterflies of Africa and wrote many books on the subject. A search of his name, along with the collection event details from the specimen labels, threw up the only book on butterflies of the area written at the time of the species’ description in 1982. Although the book is currently out of print, “The Butterflies of the Yemen Arab Republic” is available at the Museum library and had been digitised, so we were able to search the text. As we knew the family that these butterflies belong to, we were able to find the description and images of the mysterious specimens and their name: Cacyreus niebuhri, named for the 18th century Danish topographer Carsten Niebuhr, one of five men who took part in an ill-fated expedition to Yemen that saw him as the sole survivor.

Further searching online revealed that Larsen’s book is the only place that any images of this species can be found, including recent revisions and websites describing the species. The images included in the book are of a quality that makes it hard to identify important diagnostic characteristics, and resolution is even lower in the digitised copy of the book. Type specimens are the reference material for any specimen identification, so without access to a detailed image, identifying anything as C. niebuhri becomes extremely difficult, leading to misidentifications or no identifications at all. The quality of the images that we have released on data.nhm.ac.uk helps to address this problem.

Above left: The Museum’s image of the paratype specimen of Cacyreus niebuhri. Right: The only reference image available for C. niebuhri before this project.

Sharing is caring

By sharing data about our specimens we provide a resource that can be used by the scientific community and the public in a number of ways. One of the reasons museum collections remain such an important scientific resource is that they provide a window into a species’ past, allowing us to compare them over time and space, revealing if and how their distributions have altered with the rapidly changing environment. This all starts with being able to give members of the same species the correct name, so that the comparisons are meaningful.

C. niebuhri, a member of the Lycaenidae family, is endemic to the Republic of Yemen, only occurring on the upper reaches of the wetter mountains of that country. These mountains form part of the Arabian Peninsula ecoregion, a region that supports thousands of unique plants and animals and one that is increasingly under pressure from deforestation and soil erosion. Any work aiming to mitigate these pressures on endemic species needs first to know what species occur in this area so that their populations can be monitored. Comparing individuals currently in the area to a name-bearing type specimen should make this easier.

A paratype specimen of the near threatened Dingana alaedeus

Dingana alaedeus is another example of an endemic species that the Museum holds type material for. Commonly known as the Wakkerstroom widow, this butterfly is found only in South Africa’s high altitude grasslands at elevations of about 2,000 metres and was classified as “Near Threatened” during the 2013 Conservation Assessment of Butterflies for South Africa. As with the previous example, there is little information relating to this species online, with the same single image being used on several different online resources. In fact, for most of the 220 species we have digitised during this project, the images that we have uploaded to the Museum’s Data Portal are the first and only images to be easily accessible online.

Unlocking the Museum’s collections and making them available to all is the mission behind many of our digitisation projects and is one of the Museum’s strategic priorities. There are over 1.5 billion natural history specimens in collections around the world. They have the potential to play a critical role in addressing the most important challenge that humans face over the next years: how to map a sustainable future for ourselves and our changing planet. To see the butterfly types digitised during this project, and over 4.3 million other specimens, visit the Museum’s Data Portal.

A kaleidoscope of beautiful birdwings


We have completed digitising the Museum’s birdwing butterfly collection. Images of more than 8000 specimens have been released onto the Museum’s Data Portal for anyone in the world to access. This digitisation project has enabled us to gather accurate information about what we have within our collection, and this new online resource will support conservation plans to protect endangered species for the future.


A swarm of Madagascan moths to join our online collection | Digital Collections Programme

The Madagascan digitisation team, alongside the 5,700+ specimens digitised during this project.
The Madagascan digitisation team. From left to right: Phaedra Kokkini, David Lees, Alessandro Giusti, Alberto Zilli, Geoff Martin, Peter Wing and Louise Allan.

We have finished imaging more than 5,700 Madagascan butterfly and moth (Lepidoptera) type specimens in the Museum’s collection.

What’s the difference between a moth and a butterfly? | Digital Collections Programme

We are currently digitising the Madagascan Lepidoptera collection, a project that has been supported by John Franks and the Charles Wolfson Charitable Trust.

A drawer of Madagascan type specimens

The specimens imaged are ‘Types’ – specimens from which the relevant species was named and described.


A Flutter of Data | Digital Collections Programme

Examples of some of the Lepidoptera specimens available on the Data Portal.

The final batch of data from the iCollections project has now been released through the Museum’s Data Portal – a total of 260,000 Lepidoptera specimen records, bringing the total number of Museum specimen records accessible on the Portal to just over 3.8 million.

What was iCollections?

In 2013 the Museum started to look at the best way to digitise Butterflies and Moths from the UK and Ireland, a collection estimated at half a million specimens. This was a pilot project to develop quick and efficient ways to digitise large Museum collections.

Digitisation Workflow

During the pilot project we trialled and adapted methods of image capture to suit the specimens, giving us an efficient workflow which can be used to digitise wider pinned insect collections. We place each specimen in a specially designed unit tray, with raised sides where we position the specimen’s labels and add a barcode encoded with the unique specimen number. We place each tray in a light box under a DSLR camera to capture an image containing the majority of specimen data. These images are ingested into a bespoke database, which allows species name and location (within the collection) to be added to the file. The database transcription interface lets us add additional data from labels.
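
The kind of record this workflow builds up can be pictured with a small Python sketch. It is only an illustration: the class, field names and example values are hypothetical and do not describe the Museum's actual database.

```python
from dataclasses import dataclass, field


@dataclass
class SpecimenRecord:
    """One digitised specimen; all field names here are hypothetical."""

    barcode: str  # unique specimen number encoded in the barcode
    image_path: str  # photo of the specimen, its labels and the barcode
    species_name: str = ""  # added after the image is ingested
    collection_location: str = ""  # where the specimen sits in the collection
    label_data: dict[str, str] = field(default_factory=dict)

    def transcribe(self, field_name: str, value: str) -> None:
        """Record one piece of information read from the specimen's labels."""
        self.label_data[field_name] = value

    def is_catalogued(self) -> bool:
        """True once both the species name and collection location are set."""
        return bool(self.species_name and self.collection_location)


# Made-up example values, for illustration only.
record = SpecimenRecord(barcode="EXAMPLE-0001", image_path="images/EXAMPLE-0001.jpg")
record.species_name = "Lasiommata megera"
record.collection_location = "drawer 12"
record.transcribe("collection_date", "1950-06-01")
```

Keeping the image and barcode as the only required fields reflects the order of the steps described above: the photograph comes first, and the name, location and label transcription are added to the record afterwards.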

We photograph each specimen and its labels, and data is then added to the record via a transcription interface.

During the iCollections project, we became much more efficient with the time taken to photograph a single specimen, whilst ensuring that the damage to these precious specimens from handling is kept to a minimum. We digitised the entire butterfly collection of over 180,000 specimens and made a significant start on the moths by digitising over 260,000 specimens.

In 2016 we secured further funding to carry on the digitisation of the British and Irish moths with our refined workflow. Once this has been completed, further data will be released on the Data Portal. When complete, we will have just over half a million Lepidoptera specimens accessible to anyone in the world with an internet connection. This enhances access to our collection, which has traditionally been via visits or specimen loans. In some cases the researcher may only require a digital specimen, or the digital records could help a researcher narrow down the scope of what they may want to study on a visit to the Museum.

iCollections enabled us to come up with an efficient and bespoke workflow for pinned insects which we have been able to re-use. We have published a paper on the iCollections method, to share this with the natural history community. We have also used the learning from iCollections to start new projects, such as our current project to digitise Madagascan Lepidoptera type specimens.

Why Butterflies and Moths? 

The British Lepidoptera collection contains over half a million pinned specimens collected in the UK and Ireland spanning over 200 years. It includes donations from important collectors of the twentieth and twenty-first centuries. As we digitise the Lepidoptera collections we are georeferencing each record, mapping the distribution of species and revealing collecting trends since the mid-nineteenth century.

By providing access to this unrivalled historical, taxonomic and geographical data we can equip more scientists to conduct new research in new ways. For example, Museum scientists Steve Brooks et al. compared butterfly data to historical temperature records and found that 92% of the 51 species studied emerged earlier in years with higher spring temperatures.

‘The warming climate is already causing butterflies to emerge earlier – and unless their food plants adapt at the same rate, the insects could emerge too early to survive.’ (S.Brooks et al., 2016)

When it comes to digitising Lepidoptera, our digitisers can now process up to 300 specimens a day. They get to see and interact with the specimens up close and become extremely fast with a pair of forceps! Our digitiser Peter Wing told us “My favourite image to digitise was a Monarch Butterfly that was pinned with a sewing needle.” While digitising, we uncover some fascinating stories behind the collection. We have been sharing some of these enlightening moments by using #MothMonday on Twitter.

Our digitiser Peter with his favourite specimen

Who’s using our data?

We are on a mission to digitise the Museum collection of 80 million specimens. We want to make our unrivalled historical, geographic and taxonomic specimen data, gathered over the last 250 years, available to the global scientific community. These data, along with associated specimen images, are released through the Museum’s Data Portal.

Through the Data Portal and those of our partners like the Global Biodiversity Information Facility (GBIF), more than 5.9 billion records have been accessed in over 115,500 downloads since April 2015. Through GBIF we are also able to see which scientists are using our data as part of their papers and through Altmetric how many people are talking about our data online. So far we have been cited in 44 papers and referenced over 100 times online.

The Data Portal currently has around 200 non-museum users each day and contains more than 700,000 species-level (index lot) records and over 90 research datasets uploaded by NHM staff and other institutions. This includes 3D scans, images and audio recordings as well as more traditional data.

Critical information is currently locked away within hundreds of millions of specimens, labels and archives in collections across the globe. Our ultimate goal is to unlock this treasure trove of information so that scientists, researchers and data analysts from around the world can use this information to tackle some of the big questions of our time.

To make use of the Museum’s iCollections data, please visit the Data Portal. To hear more stories behind the Lepidoptera collection, you can follow our #MothMonday content on Twitter or keep up to date with the Museum’s digitisation projects on the website.

Digitising the Madagascan Lepidoptera type specimens | Digital Collections Programme

We have started digitising the Madagascan moths and butterflies, a project that has been supported by John Franks and the Charles Wolfson Charitable Trust.

Holotype of the giant orange-tip (Gideona lucasi) butterfly, with accompanying barcode

This project is different from our previous Lepidoptera digitisation as it is only looking at type specimens.

A type specimen (or in some cases a group of specimens) is an example specimen on which the description and name of a new species is based.
