Donate Speech Corpus: Test data from multi-transcriber speakers (10h)

Lahjoita puhetta -aineisto: Testidata useaan kertaan litteroiduilta puhujilta (10h)


Persistent Identifier of this resource:

This resource is available for download from Kielipankki - The Language Bank of Finland as part of "Donate Speech: Selected dataset".

The resource contains a 10-hour subset of speech from the Donate Speech Corpus. It includes the smaller set puhelahjat-test-mtr, in which each recording was transcribed by four different transcribers, and extends that set with all recordings by the same 57 speakers (as identified in the metadata accompanying the original recordings). The multi-transcriber data was used for testing an ASR system at Aalto University.

For speech technology development purposes, this multi-transcriber speaker dataset can be used together with the smaller puhelahjat-test-mtr set.
