The Suomi 24 Corpus (2017H2)
Suomi 24 -korpus (2017H2)
Persistent Identifier of this resource:
The corpus is available in Kielipankki, the Language Bank of Finland. Download: http://urn.fi/urn:nbn:fi:lb-2019010801. License details: http://urn.fi/urn:nbn:fi:lb-20150304151
The corpus contains all the texts available through the Suomi24 API from the discussion forums of the Suomi24 online social networking website, covering 1 January 2001 to 31 December 2017. A tokenized version was created first, and the annotation was then carried out by Jussi Piitulainen.
Researchers who have a user name and a password can download the entire corpus in the VRT format.
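As a rough illustration of working with a downloaded VRT file, the following Python sketch groups token lines into sentences. It assumes the usual VRT layout (one token per line with tab-separated attributes, and XML-like structural tags on lines of their own) and a `sentence` element. The function name `iter_sentences`, the sample lines, and the two-column layout (word form, lemma) are hypothetical and are not taken from the corpus documentation; the actual attribute columns should be checked against the downloaded files.

```python
from typing import Iterable, Iterator, List


def iter_sentences(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield each sentence as a list of token rows (one list of fields per token)."""
    current: List[List[str]] = []
    in_sentence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("<") and line.endswith(">"):
            # Structural markup line, e.g. <text ...>, <sentence>, </sentence>.
            if line.startswith("<sentence"):
                current, in_sentence = [], True
            elif line.startswith("</sentence"):
                yield current
                current, in_sentence = [], False
            continue
        if in_sentence:
            # Token line: attributes are separated by tabs.
            current.append(line.split("\t"))


# Hypothetical sample; the real corpus has more attributes per token.
sample = [
    '<text title="example">\n',
    "<sentence>\n",
    "Hei\thei\n",
    "maailma\tmaailma\n",
    "</sentence>\n",
    "</text>\n",
]

sentences = list(iter_sentences(sample))
```

For a real file, the same function could be fed an open file object, for example `iter_sentences(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded VRT files.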
NB! 2019-09-02 Discrepancies in dependency parses: The dependency parses and relations in the Suomi24 Corpus 2017H2 differ significantly from the parses in other corpora parsed earlier with the same parser. We are investigating the issue. If you need dependency parse information, we recommend using Suomi24 2016H2.
- Turku Dependency Treebank parser