Bird Vocalization Generation with DL Thesis Demo

This page presents audio demos from the thesis "Bird Vocalization Generation using Deep Learning" by Andrii Shevtsov, supervised by Volodymyr Sydorskyi. The work was completed in fulfillment of the requirements for the MSc program at the Ukrainian Catholic University, Lviv, Ukraine.

On this page, Ground Truth (GT) samples from 12 randomly selected bird species are compared with generations from the proposed approaches and the baseline methods discussed in the thesis. For each method, one sample is selected from three randomly generated outputs. Ground Truth samples are selected entirely at random. Species are chosen randomly from three categories of occurrence frequency (common, uncommon, and rare), similar to the strategy used in the thesis.
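The species selection described above can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the function names, the `train_counts` mapping, and the seed are hypothetical. The category boundaries are also an assumption, since the ranges listed on this page overlap at 50 and 100; the sketch treats 100 or more samples as common and 50 or fewer as rare.

```python
import random
from collections import defaultdict

CATEGORIES = ("common", "uncommon", "rare")


def frequency_category(n_train: int) -> str:
    """Map a training-sample count to a category (boundaries are assumed)."""
    if n_train >= 100:
        return "common"
    if n_train > 50:
        return "uncommon"
    return "rare"


def pick_demo_species(train_counts, per_category=4, seed=0):
    """Randomly pick up to `per_category` species from each frequency category.

    `train_counts` is a hypothetical mapping: species name -> number of
    training samples.
    """
    buckets = defaultdict(list)
    # Sort so the result depends only on the seed, not on dict ordering.
    for species, n_train in sorted(train_counts.items()):
        buckets[frequency_category(n_train)].append(species)

    rng = random.Random(seed)
    return {
        cat: rng.sample(buckets[cat], min(per_category, len(buckets[cat])))
        for cat in CATEGORIES
    }
```

With four species per category this gives the 12 species shown on the page; fixing the seed keeps the selection reproducible.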

Common species

100+ training samples present.

Amazonian Motmot

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Red-eyed Vireo

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Eurasian Collared-Dove

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Eurasian Skylark

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Uncommon species

50-100 training samples present.

Scrub Tanager

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Dusky-capped Greenlet

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

American Golden-Plover

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Elegant Woodcreeper

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Rare species

50 or fewer training samples present.

Gray-headed Chachalaca

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Andean Motmot

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Black-faced Cotinga

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0

Gray-crowned Rosy-Finch

GT
StableBird-class
Rectified Flow
StableBird-text
Stable Audio Open 1.0