Donate Speech Corpus: Test data from multi-transcriber speakers (10h) 
Lahjoita puhetta -aineisto: Testidata useaan kertaan litteroiduilta puhujilta (10h)
puhelahjat-test-mtr-s
Persistent Identifier of this resource:
http://urn.fi/urn:nbn:fi:lb-2022060125
This resource is available for download in Kielipankki - The Language Bank of Finland as part of "Donate Speech: Selected dataset", http://urn.fi/urn:nbn:fi:lb-2022060127.
The resource contains a 10-hour subset of speech from the Donate Speech Corpus. It includes the smaller set puhelahjat-test-mtr, in which each recording was transcribed by four different transcribers, and extends it with all other recordings by the same 57 speakers (identified from the metadata accompanying the original recordings). The multi-transcriber data was used for testing an ASR system at Aalto University.
For speech technology development purposes, this multi-transcriber speaker dataset can be used together with the smaller puhelahjat-test-mtr set.