TweetyBERT: Automated parsing of birdsong through self-supervised machine learning

George Vengrovski, Miranda Hulsey-Vincent, Melissa A. Bemrose, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: April 10, 2025

Deep neural networks can be trained to parse animal vocalizations, serving to identify the units of communication and annotating sequences for subsequent statistical analysis. However, current methods rely on human-labeled data for training. The challenge of parsing vocalizations in a fully unsupervised manner remains an open problem. Addressing this challenge, we introduce TweetyBERT, a self-supervised transformer network developed for the analysis of birdsong. The model is trained to predict masked or hidden fragments of audio, but is not exposed to human supervision or labels. Applied to canary song, TweetyBERT autonomously learns the behavioral units of song, such as notes, syllables, and phrases, capturing intricate acoustic and temporal patterns. This approach of developing models specifically tailored to animal communication will significantly accelerate the analysis of unlabeled vocal data.
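The masked-prediction objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mask_spectrogram` and `masked_mse`, the choice to zero out masked frames, the span length, the masked fraction, and the use of mean squared error are all assumptions made for illustration. The idea shown is only that contiguous time spans of a spectrogram are hidden and the reconstruction error is scored on the hidden frames alone, so no labels are needed.

```python
import numpy as np

def mask_spectrogram(spec, span=8, mask_fraction=0.25, rng=None):
    """Hide random contiguous time spans of a (freq, time) spectrogram.

    Returns the masked copy and a boolean mask over time frames.
    Zero-filling is an illustrative assumption, not the paper's method.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_frames = spec.shape[1]
    mask = np.zeros(n_frames, dtype=bool)
    target = int(mask_fraction * n_frames)
    # Keep adding spans until at least the target number of frames is hidden.
    while mask.sum() < target:
        start = rng.integers(0, max(1, n_frames - span + 1))
        mask[start:start + span] = True
    masked = spec.copy()
    masked[:, mask] = 0.0
    return masked, mask

def masked_mse(prediction, spec, mask):
    """Mean squared error computed only over the hidden frames."""
    diff = prediction[:, mask] - spec[:, mask]
    return float(np.mean(diff ** 2))
```

In a training loop, a model would receive `masked` as input and its output would be passed to `masked_mse` together with the original `spec` and `mask`; the model itself is omitted here because the abstract does not describe its architecture beyond calling it a transformer.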

Language: English

Citations

0