Corpus of Age-related Voice Disguise


Last update: 2021-08-09


Finnish name: Muunnellun puheen korpus

Short name: AVOID

Persistent Identifier of this resource:

http://urn.fi/urn:nbn:fi:lb-2018060621

Access location: http://urn.fi/urn:nbn:fi:lb-201901163

This corpus contains normal and age-disguised speech from 60 native Finnish speakers (31 female, 29 male). Each speaker read the same text fragments several times: in their modal voice and in two disguised voices, first imitating an elderly speaker and then imitating a child. The texts were the Finnish translations of The Rainbow Passage and The North Wind and the Sun, plus two English sentences from the TIMIT[1] corpus (SA1, SA2). The corpus includes 78 sentence samples per speaker (66 Finnish, 12 English). The speech was recorded simultaneously with a portable recorder fitted with a close-talking microphone and with two smartphone applications, yielding 14,040 audio files in total (3 * 4,680). The material was recorded in summer 2015 to study the effect of voice disguise on automatic speaker recognition.
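The figures in the description can be cross-checked with a short calculation. The sketch below is illustrative only and is not part of the corpus distribution; the function name `corpus_counts` and its parameter names are made up for this example. It assumes 11 Finnish and 2 English source sentences (taken from the language fields of this record), three voice conditions, two recording sessions per speaker (stated in the longer description further down this record), and three recording devices.

```python
def corpus_counts(speakers=60, finnish_sentences=11, english_sentences=2,
                  voices=3, sessions=2, devices=3):
    """Return utterance and file counts implied by the recording design.

    All defaults are read off this metadata record; the function itself
    is a hypothetical helper, not something shipped with the corpus.
    """
    # Number of times one speaker reads each source sentence.
    readings = voices * sessions
    fi_per_speaker = finnish_sentences * readings
    en_per_speaker = english_sentences * readings
    per_speaker = fi_per_speaker + en_per_speaker
    return {
        "finnish_per_speaker": fi_per_speaker,
        "english_per_speaker": en_per_speaker,
        "per_speaker": per_speaker,
        "finnish_total": speakers * fi_per_speaker,
        "english_total": speakers * en_per_speaker,
        "utterances": speakers * per_speaker,
        # Every utterance was captured on each device at the same time.
        "files": devices * speakers * per_speaker,
    }


counts = corpus_counts()
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumptions the calculation reproduces the numbers quoted in this record: 66 Finnish and 12 English samples per speaker, 78 samples per speaker, 4,680 utterances, and 14,040 files.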

Access to the corpus requires a personal application, apply here: https://lbr.csc.fi

Further information is available in the following publications:

Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen and Ville Hautamäki, "Age-Related Voice Disguise and its Impact in Speaker Verification Accuracy", Proc. Odyssey: the Speaker and Language Recognition Workshop, Bilbao, Spain, June, 2016.

Rosa González Hautamäki, Md Sahidullah, Ville Hautamäki and Tomi Kinnunen, "Acoustical and perceptual study of voice disguise by age modification in speaker verification", Speech Communication, Volume 95, December 2017, Pages 1-15, doi: doi.org/10.1016/j.specom.2017.10.002


The corpus consists of speech samples in which speakers read text aloud either in their normal voice or while trying to sound like a person of a different age. The material contains samples from 60 adult speakers (31 women, 29 men), each of whom took part in two recording sessions. In each session, the speaker read aloud two Finnish text passages and two English sentences: once in their own voice, once imitating an elderly person, and once imitating a child. The Finnish texts were "Sateenkaaritarina" (The Rainbow Passage) and "Pohjantuuli ja aurinko" (The North Wind and the Sun). The English sentences were taken from the TIMIT[1] corpus (SA1, SA2). The material contains each speaker's samples of 78 sentences (66 in Finnish, 12 in English). Each sentence is stored in its own WAV audio file. The speech samples were recorded simultaneously with a portable recorder and two smartphones, giving 14,040 audio files in total (3 * 4,680). The material was collected in summer 2015 in a project that studied the effect of voice disguise on automatic speaker recognition.



Distribution

Availability

Available - Restricted Use

Licence

CLARIN RES

Restrictions: Academic - Non Commercial Use, Attribution, No Redistribution, Other

Attribution Details: Tomi Kinnunen, Rosa González Hautamäki, Md Sahidullah, Ville Hautamäki, Stefan Werner and Maria Bentz (in preparation). Corpus of Age-related Voice Disguise (AVOID) [speech corpus]. Kielipankki - The Language Bank of Finland. URL: http://urn.fi/urn:nbn:fi:lb-2018060621.

Licensors:

Rosa González Hautamäki

Itä-Suomen yliopisto, University of Eastern Finland

Tomi Kinnunen

Distribution rights holders:

University of Helsinki

IPR Holder

Stefan Werner

Tomi Kinnunen

Md Sahidullah

Rosa González Hautamäki

Maria Bentz

Ville Hautamäki

Contact Persons

Tomi Kinnunen

Rosa González Hautamäki


Bilingual text corpus

Languages

English (2 Sentences)

Finnish (11 Sentences)

Linguality

Linguality type: Bilingual

Multi-linguality type: Other (The Finnish translations of the stories "Rainbow Passage" and "North Wind and the Sun", and two English sentences from TIMIT[1] (SA1, SA2))

Size

13 Sentences

Bilingual audio corpus

Languages

English (720 Sentences)

Finnish (3,960 Sentences)

Linguality

Linguality type: Bilingual

Multi-linguality type: Other (Read-aloud versions of the Finnish translations of the stories "Rainbow Passage" and "North Wind and the Sun", and two English sentences from TIMIT[1] (SA1, SA2))

Size

14,040 Files

78 Sentences

Metadata

Created: 2018-05-09

Last Updated: 2021-08-09

Revision: Link to resource group page added

Metadata Creator

Ute Dieckmann

Mietta Lennes

Documentation

How to cite: https://www.kielipan...

Resource group page: http://urn.fi/urn:nb...

License - Lisenssi: https://www.kielipan...

Document Type: Manual

Ohjeet henkilötietoja sisältävien Kielipankin aineistojen käsittelyyn, Guidelines for processing corpora containing personal data in the Language Bank of Finland, http://urn.fi/urn:nb...

Editor: FIN-CLARIN

Document Language: English

Document Type: Other

Tietosuojaselvitys: AVOID, http://urn.fi/urn:nb...

Publisher: FIN-CLARIN

Document Language: Finnish
