Dr Lia Gilmour, Research Manager at Bat Conservation Trust (BCT), and Martin Newman, a consultant for the Trust, explain how a combination of cheaper sensors, machine learning and automated workflows built on AWS is growing knowledge of the acoustic landscape and bringing bat conservation to a wider audience.
Bat conservation doesn't intuitively seem like an area where machine learning (ML) algorithms and cloud computing would be hugely important. However, in common with other areas of environmentalism, the ML branch of AI is having a genuinely positive impact by enabling organisations to collect and make sense of vast quantities of data.
Dr Lia Gilmour, Research Manager at Bat Conservation Trust (BCT), explains how ML is proving to be a game changer for collecting and classifying bat echolocation calls.
"There are 17 breeding species of bats in the UK, 11 of which we currently monitor and 5 of those are by acoustic methods. Historically, that involves going out at night with a detector and writing down what you see and hear."
The problem with this approach is that it demands a fairly high level of knowledge, and time. What if there was an inexpensive recording device that you could leave outside to gather data passively? Happily, there now is – it's called an Audiomoth. The device, a credit-card-sized piece of open-source hardware, enables passive acoustic monitoring. It can be left outside in a garden or community space, then collected after several days and sent back to BCT.
Gilmour explains what happens next:
"We upload high-frequency ultrasound data from the SD card into AWS. That data then gets managed and processed through a machine learning algorithm, which detects whether there's a bat call."
Data can then be extracted and a report generated for the volunteer to let them know which bat species, if any, have been detected. BCT has been collaborating with researchers on the sound classification algorithm (BatDetect2) for some time. It's already having an outsized impact.
"It's a convolutional neural network and we have been instrumental in collecting the training data and helping to develop the algorithm with University College London and Edinburgh University," explains Gilmour.
"The collaboration is really important to us for us to be at the cutting edge of research on classifiers. When you record a bat call it's a very technical process to understand what the species are. This is what we're training the machine learning algorithms to do. It's not an over exaggeration to say that what used to take years, the algorithms are doing in days."
Martin Newman, consultant at BCT, adds some statistical detail:
"Last year on one particular project, we processed 66 terabytes of data and generated 34 and a half million bat pulses which would have taken around 150 man years of people listening to recordings to identify manually. It took 10 days. We ran 300,000 individual jobs in an AWS batch job. "
At present, only the analysis is automated, but the next stage of the project, which BCT hopes to have in place later this year, is automating the data extraction and reporting part of the workflow. Newman explains:
"We have developed some software where people can upload themselves rather than sending us the Audiomoth. But because we're recording in very high frequencies that means the data is large. A night recording is typically 24 or 25 gigabytes, which on some people's connections will take a long time to upload to S3.
"When it gets to S3, the classifier is run and the results are written to a database which is written in R [an open-source programming language designed for data mining, analytics and visualisation.] The next stage will be to automate the process where we extract the data and give people nice reports about what's happened in their backyard."
Growing volunteer resource
One of the hoped-for outcomes of the sound classification system (SCS) is that it will encourage volunteers and also widen the range of backgrounds that they come from. Most people in the UK live in an urban setting. Gilmour explains more:
"The Nightwatch project is urban focused. It's about people getting involved with and understanding the nature in their local environment and having that kind of ownership on that data. That allows them to use that data to inform their decisions on how they're going to look after their environment.
"We're getting a new generation of bat experts upskilled potentially from their bedroom. Knowledge of bat calls is a real skill in ecology and conservation, and this allows us to tap into a new range of people that might want to connect with nature and get involved to learn about conservation."
A problem experienced across the conservation sector is that volunteers, by definition, need some time on their hands, and a majority live rurally. Volunteers also have to be able to get out to sites. This has tended to rule out younger people and those who struggle with mobility.
"We have amazing volunteers," says Gilmour, "but we are trying to expand that and get more people involved in conservation and diversify. There are projects trying to involve people from less advantaged backgrounds or those who might not be in work. Woodland Hope in North Wales is centred on temperature rainforest habitats and getting people connected with their woodland heritage and the bats that live there."
Gold standard data quality
One of the biggest challenges in creating a gold standard ML algorithm is ensuring the quality of the data feeding it. Bat calls create some specific data quality challenges.
"Echo Hub is a bat call library, but also a training hub, which we're developing to be part of the sound classification system," says Gilmour. "We want it to act as a repository for gold standard data because what's really important with the machine learning is a really good training data set. If you're recording a bat, to understand what the species is you have to be sure that you're not recording something else. It has to be done with known roosts and a specific protocol which our community bat experts know all about."
Gilmour envisions Echo Hub acting as an open-source template that other researchers and NGOs can use around the world, especially where less is known about what species they may have. The hub can also be used to upskill more people.
"We're working on a gamification angle," she explains, "so leaderboards and scores so you can rank data quality in background and decide which data goes into training the algorithm."
Long term, the algorithm and SCS infrastructure will be used to monitor all 18 bat species and to understand how they are faring in the face of climate change, land use changes and other pressures.
Gilmour also has plans to scale the classifier and plug in different algorithms.
"Our vision is to be able to plug in many different types of machine learning classifiers, we want this one to be a template. We're hoping to publish and show it as a case study of machine learning infrastructure so we can then develop other biodiversity data applications. Within BCT, we're hoping to plug in a soundscape algorithm so we can look at the health of ecosystems by comparing the ratio of paternic (human-made) sounds to natural sounds.
"We're working with a big partner on that in forestry and understanding how those sorts of areas are used by the general public and the impact of that on those ecosystems."
Gilmour also has ambition for regional datasets because bats have accents.
"We're working on regional classifiers because bats do tend to have accidents we found. In Ireland and the Channel Islands bats sound a bit different! The acoustic landscape can tell us how climate change is affecting ecosystems, and help us predict what might happen."