Becoming a master of digitisation | Digital Collections

A guest blog from our Summer placement students Janice Wu and Ying Luo

Janice and Ying at work preparing some mollusc shells for digitisation

The digitisation team started a mass approach to digitising our collections nearly a decade ago. At this time, there were very few museum employees around the country who could claim their job title was “digitiser”. Now, ten years on, the Museum has digitised 5.6 million specimens, the digitisation team has nine full-time digitisers and has been able to host two placement students this summer.

As digitisation is a fairly new career option, unlike becoming a curator, it is often not something that university students consider, and we really wanted to change that.

Larissa Welton, one of our full-time digitisers who has been supervising our summer placements, started her career at the Museum when she was able to take part in an eight-week placement during her master’s degree in Museum Studies at the University of Leicester. This later led to her finding out about digitisation and applying for a job as a digitiser.

‘I’m so glad I found this path. I get to handle our fantastic specimens every day and work to increase digital access to our collections for researchers and the public alike,’ Larissa says.

‘Once I got this job, I wanted to offer other students the same opportunity I had, but in the growing field of digitisation.’

Hosting students has been a valuable and rewarding experience and we hope to continue to do this, whether it’s school work experience students or university placements. This blog is by our two 2023 placement students, Janice and Ying, who are both master’s students in Museum Studies at the University of Leicester. They will share their experiences as they are introduced to digitisation over their eight-week summer placement at the Museum.

Why should we digitise?

An insect in a lightbox for imaging

Natural history collections that date back hundreds of years are often the only place that contains a baseline of biodiversity from before industrialisation, wide-scale intensive farming and mining the ocean for important minerals. Because of this, digitising museum collections and setting this data free is vital for those wanting to understand what has changed and how we can work in the future to conserve our biodiversity so that both people and the planet can thrive.

Janice tells us ‘Over the past few years, we have witnessed a series of events, including the global pandemic and the fires at the National Museum of Brazil and Notre-Dame de Paris. These events made me realise the vulnerability of our global collections and culture. This made me want to preserve and protect museum and heritage collections.’

Digitisation is an impactful approach to preserving and protecting collections – it can sometimes reduce the amount of handling that specimens are subjected to when people are using them for research, and it shares these collections with the world, opening them up to scientists, students and artists who can be inspired by these collections and use them to better the planet.

What is digitisation?

Digitisation is simply the process of creating a digitally discoverable record. This might or might not include an image, but will include important information about a specimen such as what species it is, where and when it was found and if it is a ‘type’ or example specimen from which a species is named. This is important information to release to the world as it helps scientists track how populations have changed over time, for instance if there are changes in the appearance of a species due to climate warming, or to discover if what they have collected on a field trip is a new species.

The digitisation team often refers to the different methods of digitisation as ‘workflows’. A workflow is a specific set of instructions to follow when working with a series of like objects that all require similar steps. You can find out more in this blog on a Day in the life of a digitiser. Each time a digitiser starts a new project they are trained in the workflow and given an introduction to the collection before starting official digitisation. During their eight-week placement, Janice and Ying were trained to digitise microscope slides, pinned insects, molluscs and herbarium sheets.

The first workflow our digitisers learn is the microscope slide workflow. This is the quickest workflow that we have and the highest number one digitiser has ever managed to get through in one day is 1525 slides. The Museum has more than 2.5 million microscope slides, which can contain whole small insects, parts of other specimens or tiny fossils.

‘I was surprised to learn that the slide digitisation workflows that have been developed take an average of 16 seconds to image a slide, and they help me to semi-automate slide capture and data editing. This greatly improves efficiency and reduces the potential for human error.’ Ying

Janice ‘This is a highly efficient workflow and I can get through 400-600 slides per day which is really satisfying. I find that listening to music as I work helps my concentration.’
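To put these figures in context, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (16 seconds per slide, the 1525-slide record day and the 2.5 million slide collection) and assumes, purely for illustration, that imaging time is the only cost; handling, data checking and breaks are ignored, so real timings will differ.

```python
# Figures quoted in this post; everything else here is an illustrative assumption.
SECONDS_PER_SLIDE = 16        # average time to image one slide
RECORD_SLIDES_PER_DAY = 1525  # highest single-day total by one digitiser
TOTAL_SLIDES = 2_500_000      # "more than 2.5 million", used as a lower bound


def imaging_hours(slides: int, seconds_per_slide: int = SECONDS_PER_SLIDE) -> float:
    """Hours of pure imaging time for a given number of slides."""
    return slides * seconds_per_slide / 3600


slides_per_hour = 3600 / SECONDS_PER_SLIDE
record_day_hours = imaging_hours(RECORD_SLIDES_PER_DAY)
whole_collection_hours = imaging_hours(TOTAL_SLIDES)

print(f"{slides_per_hour:.0f} slides per hour at the average rate")
print(f"{record_day_hours:.2f} hours of imaging on the record day")
print(f"{whole_collection_hours:,.0f} hours of imaging for 2.5 million slides")
```

At the average rate this works out at 225 slides per hour, so the record day represents nearly seven hours of continuous imaging.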

Expectation vs. Reality — Even Better!

Before the placement, Janice didn’t interact much with nature. ‘During our placement, I have been exposed to so much variety in the natural world. This placement has shown me how incredible nature can be.’

‘My favourite specimen so far is Chan’s megastick. This stick insect, measuring 567mm, is the second largest insect in the world (it held the world record until 2008). The team had to use a drawer scanner to image this specimen.’

‘After being a digitiser, I realised that each person on the team can contribute a little to the bigger impact every day. I’m really proud that what I do on a daily basis can benefit society for centuries to come.’

Ying tells us ‘I love how well preserved the specimens are in the collection – the butterflies still have their brilliant colours and fragrant plants like thyme retain their characteristic smell over 100 years later.’

‘My favourite part of the role so far has been being able to meet new people both in the digitisation lab and wider Museum. I enjoy that each project is a collaboration between the digitisation team and the curators. I already know that eight weeks are not enough for me and I will be looking to join the Museum again in the future.’

The Future is Bright and digital

Janice ‘I feel very lucky to have been able to work with such a welcoming team, and our supervisor, Larissa, has been amazing. I would be thrilled if I could continue in digitisation work and contribute to making collections accessible to everyone in my future career. If you ever get the opportunity to become a digitiser, seize it!’

Ying ‘I was surprised by the large amount of scientific terminology that needs to be learnt for this role – as a member of the museum’s collections team, you need to have a deep understanding of the collections. Some of our digitisers are themselves experts in insects, fossils or plants. The placement has opened my eyes to the intersection of multiple disciplines in a museum setting and that digitisation is a specialism.’

We want to thank Ying and Janice for spending time with us this summer. It has been a pleasure to get to know you and see you excel in so many areas during such a short few weeks.

If you want to stay up to date with our digitisation activities, projects and opportunities please follow us on Twitter and Instagram.

If you are a student or course leader who would like our help to support learning about digitisation during your course, please reach out to us on either of these channels as we would be happy to hear your ideas.

Crime and punishment while collecting insects? | Digital Collections

National Insect Week blog by Louise Berridge

Rev. Alfred Edwin Eaton (1844-1929) – photograph reproduced with permission,
from the collection of the Royal Entomological Society

Most of the Museum’s digitisers have encountered the entomologist Alfred Edwin Eaton (1844-1929) through his specimen labels. We are transcribing the labels of three groups of freshwater insects that Eaton often collected: the Museum’s Ephemeroptera, Trichoptera and Plecoptera collections. Eaton’s handwriting can be challenging to decipher (see examples below). We often need to share pictures in our Team’s online chat to try to work out difficult labels together. We reached a point where a new tricky Eaton label was an exciting event and finally cracking his handwriting was a relief… until the next label! We were all imagining what sort of a person Eaton might have been.

When we heard a story that Eaton had been arrested on 22 June 1896 in Algeria while collecting insects, I wanted to know more…

Eaton travelled to Algeria in 1892 as no European entomologist had yet documented Algerian Trichoptera. He found the country such an interesting place of study that he remained there for several years, with only occasional return visits to England.

At noon on 22nd June 1896 Eaton was searching for Neuroptera (net-winged flies including lacewings, mantidflies and antlions) at Forêt d’El Oubeïra, close to the Algerian border with Tunisia, when some farm workers noticed his dishevelled appearance and unusual insect-collecting equipment. The workers thought Eaton should be questioned by their local Sheikh in case he was a spy or ‘suspicious character’. There was a language barrier – at this time Algeria was under colonial rule by France: Eaton could speak French, but the Algerians he met that day were speaking Arabic, which Eaton did not know. Eaton shared his lunch with the men and then followed them at their suggestion, wandering off on the longest possible route to their village so that he could explore the area’s swamp and collect more insects. Clearly the arrest was not as terrifying as it might have been.

When they reached the village, which Eaton names as Aine Kriar, Eaton met the Sheikh who decided that he should be questioned by the nearest French Administrator, 12 miles away at El Kala (French: La Calle), where there was a jail. Eaton does not name him, but a contemporary French Almanach National lists the Administrator of El Kala as a man named Moreau. Eaton protested so loudly at being asked to share a horse with the Sheikh’s man who was supposed to accompany him to see Moreau, that eventually the Sheikh gave up and let Eaton ride a separate horse.

Eaton and his companion did not need to travel the full 12 miles to El Kala – they met Moreau, who was out travelling, on their route. Eaton bolted his horse ahead to meet Moreau in order to reach ‘safety.’ Eaton showed the French official a jar of dragonflies to illustrate what he had been doing. Moreau was not very impressed, explaining that there had been a lot of robberies in the area recently and that was why the locals were wary of strangers, but he asked Eaton if he had been treated badly. Moreau wondered why Eaton had not got any personal identification papers or travel permits, to which Eaton responded that he hadn’t known he needed any. Moreau let him go. Reading between the lines of Eaton’s account, it sounds like the men who arrested him treated him with a lot of patience. Eaton’s conclusion was that it all worked out fine and he managed to get back to where he was staying at Le Tarf in time for dinner.


One of Eaton’s dragonflies collected at Aine Kriar on 22nd June 1896 [NHMUK013388821]
And another [NHM013388822]

Eaton’s Dragonflies

Thanks to detective work by Dan Hall in the Museum’s small orders collection, we know that some dragonflies which Eaton showed Administrator Moreau on his arrest day are at the NHM, courtesy of his friend Robert McLachlan’s collection. These examples are Sympetrum sanguineum (Müller, 1764), also known as ruddy darters, a cosmopolitan species which is found all over Europe and the Mediterranean. When alive, the males are tomato-red and the females are an ochre colour (see image below). Eaton’s specimens have faded in colour – which would have happened a day or two after he collected them – but they are still in good condition.

Male ruddy darter (Sympetrum sanguineum) at RSPB Lakenheath Reserve, Suffolk (TL720867). Photograph by gailhampshire from Cradley, Malvern, U.K.

Knowing the story of how these specimens were once used as a ‘get out of jail free card’ by an eccentric entomologist certainly adds another dimension to digitising them. Eaton’s story will also help us to georeference his specimens from Algeria as we have learned more about his travels.

It would not be recommended or possible today to travel and collect overseas without ID and travel documents; today you would usually need specimen collecting permits and an environmental impact assessment might be necessary too. Eaton may not have even been travelling with a passport, as this was not always strictly enforced before the First World War.

Eaton was at least not freewheeling with his collections data, and formally recorded occurrence information for quite a lot of insect species in Algeria for the first time. Eaton sent Algerian Hydroptilids (a family of Trichoptera) he had collected to the Scottish entomologist Kenneth J Morton, who found they were significant not because they were new species, but because most of the same species had been found across Europe, and so he learned that they had a wider distribution than previously known. Eaton’s labels still form a baseline of information that helps us to monitor how insects are affected by environmental changes over a long period of time, which is why it is worth transcribing them, even if it can be difficult sometimes.

With thanks to Rosemary Pearson and the Royal Entomological Society for permission to use Eaton’s portrait, Dan Hall for tracking down Eaton’s dragonflies in the small orders collection and Krisztina Lohonya for help with transcription and georeferencing.

Find out more from our digitisers and the stories they unravel while digitising 80 million specimens by following us on Instagram and Twitter. If you want to read more about Eaton’s tale you can access his own account of his arrest on the Biodiversity Heritage Library.

A Decade of Digitisation | Digital Collections

A guest blog by Pete Wing

Pete Wing with fellow digitiser Phaedra Kokkini and the Madagascan Lepidoptera they digitised.

January 2023 marks the tenth anniversary since I first started digitising the Museum’s collections. A lot has changed in that time but the main principle of digitisation has remained the same: to transform the access and use of the Museum’s collections through unlocking natural history data and sharing this with the world.


Digitising beans to feed the world

Legumes are a group of plants that include soybeans, peas, chickpeas, peanuts and lentils. They are a significant source of protein, fibre, carbohydrates and minerals in our diet and some, like the cowpea, are drought resistant.

A new paper has been published in Biodiversity Journal about a project that the digitisation team started in 2018 with the Royal Botanic Gardens Kew (project lead) and the Royal Botanic Garden Edinburgh, to collectively digitise non-type herbarium material from the legume family. This includes rosewood trees (Dalbergia), padauk trees (Pterocarpus) and the Phaseolinae subtribe that contains many of the beans cultivated for human and animal food.

This project was made possible through Department for Environment Food & Rural Affairs (DEFRA)-allocated Official Development Assistance (ODA) funding, distributed by the UK government in its “global efforts to defeat poverty, tackle instability and create prosperity in developing countries”.

ODA-listed Countries

African: Guinea, Ethiopia, Sudan, Kenya, Uganda, Tanzania, Mozambique, Malawi and Madagascar
Asian: Bangladesh, Myanmar, Nepal, New Guinea and India
Southern and Central American: Guatemala, Honduras, El Salvador, Nicaragua, Bolivia, Argentina and Brazil

The legume groups Dalbergia, Pterocarpus and Phaseolinae were chosen for digitisation to support the development of dry beans as a sustainable and resilient crop, and to aid conservation and sustainable use of rosewood and padauk trees. Some of these beans, especially cowpea and pigeon pea, are sustainable and resilient crops, as they can be grown in poor-quality soils and are resistant to drought stress (Varshney et al. 2009). This makes them particularly suitable for agricultural production where the growing of other crops would be difficult.

Digitally discoverable herbarium specimens can provide important information about the distribution of individual species, as well as highlighting which species occur naturally together. While there have been collaborative efforts between herbaria in the past, these have tended to prioritise digitisation of type specimens. As the example specimens for which a species is named, types are important to identification, but as individual specimens they don’t offer insights into species distribution over time. By focusing on the non-types across the world and over the last 200 years, we have released a brand-new resource to the global scientific community.

Searching for beans

This collection was digitised by creating an inventory record for each specimen, attaching images of each herbarium sheet, and then transcribing more data and georeferencing the specimens, providing an accurate locality in space and time for their collection.
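The steps described above can be pictured as a record that gains detail at each stage. The Python sketch below is only an illustration: the `SpecimenRecord` class, its field names, the stage labels and the example values are all hypothetical, and it is not the Museum’s actual data model or software.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpecimenRecord:
    """Hypothetical record for one herbarium sheet (illustrative fields only)."""
    barcode: str                                      # created at the inventory step
    images: list[str] = field(default_factory=list)   # image files of the sheet
    collector: Optional[str] = None                   # transcribed from the label
    collection_year: Optional[int] = None             # transcribed from the label
    country: Optional[str] = None                     # transcribed from the label
    latitude: Optional[float] = None                  # added by georeferencing
    longitude: Optional[float] = None                 # added by georeferencing

    def stage(self) -> str:
        """Return the furthest step this record has reached."""
        if self.latitude is not None and self.longitude is not None:
            return "georeferenced"
        if self.collector or self.collection_year or self.country:
            return "transcribed"
        if self.images:
            return "imaged"
        return "inventoried"


# Walk one made-up record through the four steps (all values are placeholders).
record = SpecimenRecord(barcode="EXAMPLE-0001")
stages = [record.stage()]

record.images.append("EXAMPLE-0001_sheet.jpg")
stages.append(record.stage())

record.collector = "A. Collector"
record.collection_year = 1930
record.country = "Kenya"
stages.append(record.stage())

record.latitude, record.longitude = 0.0, 37.0
stages.append(record.stage())

print(" -> ".join(stages))
```

Keeping the later fields optional reflects the fact that a record is already useful once it has been inventoried and imaged, with transcription and georeferencing added afterwards.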

We originally had four months and three members of staff to digitise over 11,000 specimens. The Covid-19 lockdown was ironically rather lucky for this project as it enabled us to have more time to transcribe and georeference all of the records.

Map showing breakdown of records by country

We were able to assign country-level data to 10,857 out of the total number of 11,222 records. We were also able to transcribe the collectors’ names from the majority of our specimen labels (10,879 out of 11,222). Only 770 out of the 2,226 individuals identified during this project collected their specimens in ODA-listed countries. The highest contributors were: Richard Beddome (130 specimens), Charles Clarke (110), Hans Schlieben (98) and Nathaniel Wallich (79). The breakdown of records by ODA country can be seen in the chart below.
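For readers who prefer percentages, the counts quoted above can be converted with a few lines of Python. This is a small sketch that uses only the figures given in this post; the variable names are ours and the rounding to one decimal place is an arbitrary choice.

```python
TOTAL_RECORDS = 11_222      # all records in the dataset
TOTAL_COLLECTORS = 2_226    # individuals identified during the project

# Counts quoted in the paragraph above.
record_counts = {
    "country assigned": 10_857,
    "collector name transcribed": 10_879,
}
oda_collector_count = 770   # collectors who worked in ODA-listed countries


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


coverage = {label: share(n, TOTAL_RECORDS) for label, n in record_counts.items()}
oda_collector_share = share(oda_collector_count, TOTAL_COLLECTORS)

for label, percent in coverage.items():
    print(f"{label}: {percent}% of records")
print(f"collectors active in ODA-listed countries: {oda_collector_share}%")
```

In other words, roughly 97% of records have a country and a collector name, while about a third of the identified collectors worked in ODA-listed countries.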

From our data, we can see the peak decade of collection was the 1930s, with almost half (4,583 specimens or 49.43%) collected between 1900 and 1950 (Fig. 10). This peak can be attributed to three of our most prolific collectors: Arthur Kerr, John Gossweiler and Georges Le Testu, all of whom were most active in the 1930s. The oldest specimen (BM013713473) was collected by Mark Catesby (1683-1749) in the Bahamas in 1726.

Chart showing distribution of records through time

An interesting, but perhaps unsurprising, finding is that our collection is strongly male dominated. There are only two women (Caroline Whitefoord and Ynes Mexia) in the list of our top 50 plant collectors and they are not close to the most prolific collectors. We identified more women in the rest of our records, but their contribution is on average less than 25 specimens per person in the dataset consisting of more than 10,000 specimens. In contrast, the top five male collectors contributed 10% of our collection.

The oldest specimen in this collection from 1726.

Releasing Rosewoods

Both the Pterocarpus and Dalbergia genera include species that yield expensive, good-quality timber and are prone to illegal logging. Many species, such as Pterocarpus tinctorius, are also listed on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. By releasing this new resource of information on all these plants from three of the biggest herbaria in the world, we can share these data with the people who are taking care of biodiversity in these countries. The data could be used to identify hotspots where the trees grow naturally and to protect these areas. These data would also allow much closer attention to be paid to areas that could be targets for illegal logging activity.

Pterocarpus tinctorius is a species of padauk tree that is listed as endangered on the IUCN Red List. Cowpea (Vigna unguiculata) is a food and animal feed crop grown in the semi-arid tropics.

The ODA-listed countries are economically impoverished and disproportionately vulnerable to the changing climate, whether through flood, drought or rising temperatures. Using data to identify good, nutritious plant species that can be grown in such conditions can therefore benefit local communities, potentially reducing dependence on imports, aid and less resilient crops.

This dataset is now openly available on the Museum’s Data Portal and a paper about this work has been released in Biodiversity Journal. Stay in touch with the Digitisation team by following us on Instagram and Twitter.

Data in Action: British butterflies body size changes in response to climate change | Digital Collections Programme

Drawer of Silver-studded blue (Plebejus argus) butterflies from the Museum’s collection

A brand new scientific paper applies computer vision to over 125,000 specimens from the Museum’s digitised butterfly collection to understand how animals may respond to climate change.
