The FinRead corpus is a subcorpus of FinINTAS. FinRead consists of read-aloud speech from the same speakers whose conversations are included in the FinDialogue subcorpus. The corpus includes audio files (WAV) and phonetic annotation files (Praat TextGrid). FinRead will be made available at http://lat.csc.fi, along with FinDialogue.
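Because the corpus is delivered as WAV audio with Praat TextGrid annotations, a first step for most users is to check the audio headers and find out which recordings have an annotation file. The following is a minimal sketch using only the Python standard library. It assumes, hypothetically, that each annotation file shares its base name with its audio file, that both sit in one flat directory, and that the extensions are `.wav` and `.TextGrid`; none of this is stated in the corpus description, and the function names are illustrative.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic header information for one WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate,
        }


def pair_recordings(corpus_dir):
    """Pair each .wav file with a same-named .TextGrid file, if present.

    Assumption (not from the corpus documentation): audio and annotation
    files share a base name and live in the same directory.
    """
    pairs = []
    for wav_path in sorted(Path(corpus_dir).glob("*.wav")):
        grid_path = wav_path.with_suffix(".TextGrid")
        pairs.append((wav_path, grid_path if grid_path.exists() else None))
    return pairs
```

A recording paired with `None` has no annotation file under the assumed naming scheme, which may simply mean the actual file layout differs from the one assumed here.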
The speakers were native Finns from the capital city region of Finland. Ten speakers were 20 to 30 years of age (D1, D2, D4, D6, D7), whereas the rest of the speakers (D8-D12) were between 45 and 65 years of age.
Speakers F1-F7 and M1-M7 were recorded in an anechoic room, and speakers F8-F12 and M8-M12 in a professional recording studio. The speech was captured with combined headphone-microphone sets and recorded either on a DAT recorder (in the anechoic room) or on the computer system of the recording studio.
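Since the recording conditions differ between the two speaker groups, acoustic analyses may need to take the setup into account. The lookup below is a small sketch that encodes only the mapping given in the text; the function name and the returned dictionary keys are illustrative assumptions, and speaker IDs are assumed to be written as a letter `F` or `M` followed by a number.

```python
import re


def recording_setup(speaker_id):
    """Map a speaker ID such as 'F3' or 'M10' to its recording setup.

    Encodes the description in the text: speakers 1-7 were recorded in
    the anechoic room on a DAT recorder, speakers 8-12 in the recording
    studio on a computer system. The ID format is an assumption.
    """
    match = re.fullmatch(r"[FM]([0-9]+)", speaker_id)
    if match is None:
        raise ValueError(f"unrecognized speaker ID: {speaker_id!r}")
    number = int(match.group(1))
    if 1 <= number <= 7:
        return {"room": "anechoic room", "recorder": "DAT recorder"}
    if 8 <= number <= 12:
        return {"room": "recording studio", "recorder": "computer system"}
    raise ValueError(f"speaker number out of range: {speaker_id!r}")
```

For example, `recording_setup("F3")` returns the anechoic-room entry and `recording_setup("M10")` the studio entry; any other ID raises `ValueError` rather than guessing.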
The speakers were asked to read aloud 1) a short extract of the quasi-orthographic transcript of their own speech in the FinDialogue corpus, 2) a set of edited, normalized sentences produced from the same transcript, and 3) a short set of normalized written sentences that was the same for all the speakers in FinRead.