The Yle MeMAD Media Corpus

The corpus contains TV programs and videos from the archives of Yle, the Finnish Broadcasting Company. Journalistic programs (news, current affairs, etc.; no drama) have been selected on various topics and from a time period ranging from 1966 to 2018. Each browse-quality video file is accompanied by its descriptive metadata and subtitles. The main audio and subtitle languages are Finnish and Swedish, with some content also in English. As of 2018-12-31, the corpus contains 235 hours of video.

The corpus has been created and licensed for the MeMAD project, which runs from 2018 to 2020, and it will be updated as the project progresses. Use of the corpus outside the MeMAD project must be licensed separately.

The MeMAD project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 780069.
