Description
Sound recognition models that identify species in sound recordings have been developed for European breeding birds, amphibians, bats and grasshoppers. The data used for training and testing these machine learning models are obtained from online open-access bioacoustics repositories, most notably Xeno-Canto.org. The models aim to cover all species found on the European continent; however, their taxonomic coverage is currently restricted by the availability of training data. For birds, all species are covered, whereas for the other groups some species found in southern or eastern Europe are not yet included because suitable training data are not available. The volume of available training data has increased dramatically in the past few years, partly as a result of activities within MAMBO. The models are intended to be updated annually with the most recent training data, which is expected to lead to a steady expansion of taxonomic coverage over the coming years.
The models analyse sound recordings in three-second time windows, predicting presence or absence (with confidence information) for each window. They can analyse recordings of any length in both WAV and MP3 format. They are not restricted to specific recording equipment and can process recordings from smartphones, relatively inexpensive devices (such as Audiomoths) and high-end equipment alike. Detection and identification performance varies among species and also depends on the quality of the recording. The models have been tested in the field using Audiomoths, relatively inexpensive devices capable of broadband recording that covers both the audible and ultrasonic ranges at sample rates up to 384 kHz.
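
To illustrate, the sketch below shows how such window-based analysis could be structured. It is a minimal sketch, not the actual implementation: the model.predict call and the confidence threshold are assumptions, and librosa is used here simply as one common way to read both WAV and MP3 files.

import librosa  # one common library for reading both WAV and MP3

WINDOW_S = 3.0    # the models classify three-second windows
THRESHOLD = 0.7   # hypothetical confidence cut-off, not a documented value

def analyse_recording(path, model, sr=48_000):
    # Load the recording as mono at a fixed sample rate.
    audio, sr = librosa.load(path, sr=sr, mono=True)
    win = int(WINDOW_S * sr)
    detections = []
    # Step through the recording in consecutive three-second windows.
    for start in range(0, len(audio) - win + 1, win):
        window = audio[start:start + win]
        scores = model.predict(window)  # assumed to return {species: confidence}
        for species, conf in scores.items():
            if conf >= THRESHOLD:
                detections.append((start / sr, species, conf))
    return detections  # (window start in seconds, species, confidence)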
Benefits of using the tool
Many animal species are most easily, or even primarily, detected by sound. Collecting sound recordings in the field makes it possible to detect the presence of species and can even support the assessment of population trends. However, manually identifying species in sound recordings is time-consuming work that requires substantial expertise. Sound recognition models address this limitation by analysing large volumes of sound recordings efficiently.
An additional benefit is that the approach enables the collection of well-standardised data across multiple species groups using a single method. Combining passive acoustic monitoring (PAM) with automated species recognition can increase both the accuracy and the volume of ecological data while reducing costs. This results in a highly scalable method that supports standardised biodiversity monitoring on an international scale.
The sound recognition models can be deployed both for analysing long-duration recordings and for providing identification support to citizen scientists. For the latter purpose, the models will be made available in biodiversity portals such as Observation.org and the associated mobile apps. Users can opt to run the models locally on their own devices or to access them through an API, for instance via the ARISE infrastructure.
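
For the API route, the interaction could look roughly like the sketch below. This is purely illustrative: the endpoint URL, request fields and response format are placeholders, not the actual ARISE or Observation.org API.

import requests

API_URL = "https://api.example.org/v1/identify"  # placeholder, not a real endpoint

def identify_via_api(path, group="birds"):
    # Upload a recording and receive per-window species scores (assumed format).
    with open(path, "rb") as f:
        response = requests.post(
            API_URL,
            files={"audio": f},
            data={"group": group},  # assumed parameter selecting the model
            timeout=120,
        )
    response.raise_for_status()
    return response.json()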
Long-duration recordings can be used for species inventories as well as for long-term monitoring. Such monitoring may rely on fixed recording devices deployed at a single location over extended periods (PAM) or on Audio Transect Walks (ATW), in which recordings are collected together with GPS data along a standardised transect.
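
For Audio Transect Walks, detections need to be georeferenced afterwards. A minimal sketch, assuming the GPS track is logged as (seconds from start, latitude, longitude) fixes and the detections come from a windowed analysis like the one sketched above:

import numpy as np

def georeference(detections, gps_track):
    # detections: (time_s, species, confidence); gps_track: (time_s, lat, lon).
    times = np.array([t for t, _, _ in gps_track])
    lats = np.array([lat for _, lat, _ in gps_track])
    lons = np.array([lon for _, _, lon in gps_track])
    located = []
    for t, species, conf in detections:
        # Linearly interpolate the walker's position at the detection time.
        lat = np.interp(t, times, lats)
        lon = np.interp(t, times, lons)
        located.append((t, species, conf, lat, lon))
    return located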
Target users
The primary audience consists of professionals in ecology and nature conservation, including Natura 2000 officers, reserve managers, consultants and NGOs involved in biodiversity monitoring. Academic researchers and students may also use the survey protocols and the machine learning pipeline for broader ecological questions.
Once the models are made available through biodiversity portals and apps, the user base can expand to tens of thousands of volunteers actively engaged in recording biodiversity. In addition, the models may be applied in nature education, using sound-based identification to make citizens more aware of the species in their surroundings.
Future development
Although models for all four species groups mentioned above will be available from mid-2026 onwards, development work will continue. The models will be updated annually, allowing additional species to be included as new training data become available. To ensure broad accessibility for thousands of citizen scientists, biodiversity portals and mobile apps will require modification; these updates are expected to take place in 2026 and 2027. With these improvements, the models can become a primary tool for audio-based biodiversity monitoring, supported by the development of standardised protocols over the coming years.
https://xeno-canto.org/
https://observation.org/
https://www.openacousticdevices.info/audiomoth

The sound recognition models for European animals make audio-based monitoring along transects possible

A great way to visualise sound is by means of a spectrogram. In this example, multiple species were present in the sound clip: the high-pitched calls were made by a bat, and the lower part of the spectrogram contains three species of grasshoppers (Decticus albifrons, Phaneroptera nana and Uromenus rugosicollis).
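
A spectrogram like the one shown can be produced with standard tools; the sketch below uses scipy and matplotlib, with illustrative window parameters and a hypothetical file name.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

sr, audio = wavfile.read("recording.wav")  # hypothetical file name
if audio.ndim > 1:
    audio = audio.mean(axis=1)             # mix stereo down to mono

# Compute power over time and frequency, then plot on a decibel scale.
freqs, times, power = spectrogram(audio, fs=sr, nperseg=1024)
plt.pcolormesh(times, freqs / 1000, 10 * np.log10(power + 1e-12))
plt.xlabel("Time (s)")
plt.ylabel("Frequency (kHz)")
plt.show()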