Finnish Wikipedia 2017, Korp
Suomenkielinen Wikipedia 2017, Korp
wikipedia-fi-2017-korp
Persistent Identifier of this resource:
http://urn.fi/urn:nbn:fi:lb-2018060401
The Finnish Wikipedia 2017 Corpus will be available in the concordance tool Korp.
The corpus contains all Finnish-language articles of the online encyclopedia Wikipedia as they were available on 1 January 2018.
The text parts of the articles have been extracted from [Wikipedia Dumps](https://dumps.wikimedia.org/) with [WikiExtractor](https://github.com/attardi/wikiextractor).
The corpus has been tokenized and annotated with a morpho-syntactic analysis produced by the [Turku Dependency Parser](http://turkunlp.github.io/Finnish-dep-parser/).
The corpus covers the body text of the Finnish-language Wikipedia articles as of the end of 2017. The texts have been extracted from the language-specific dumps provided by Wikipedia (https://dumps.wikimedia.org/). The corpus is divided into articles, paragraphs and sentences. The sentences have been morpho-syntactically parsed with the Turku Dependency Parser (http://turkunlp.github.io/Finnish-dep-parser/).
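As an illustration of the article and paragraph structure mentioned in the description, the following is a minimal sketch of reading WikiExtractor-style plain-text output. It is not the pipeline used to build the corpus. It assumes that each article is wrapped in a `<doc id="…" url="…" title="…">` … `</doc>` block and that every non-empty line inside a block is one paragraph; the function name `iter_articles` and the sample text are hypothetical, and other WikiExtractor versions or options may produce a different header or repeat the title as the first body line.

```python
import re
from typing import Iterator

# Assumed header format of a WikiExtractor <doc> block (attribute order is an assumption).
DOC_OPEN = re.compile(
    r'<doc id="(?P<id>[^"]*)" url="(?P<url>[^"]*)" title="(?P<title>[^"]*)">'
)


def iter_articles(text: str) -> Iterator[dict]:
    """Yield one dict per <doc> block, with the body split into paragraphs."""
    current = None
    for line in text.splitlines():
        match = DOC_OPEN.match(line)
        if match:
            # Start of a new article: keep its metadata and collect paragraphs.
            current = {**match.groupdict(), "paragraphs": []}
        elif line.strip() == "</doc>":
            if current is not None:
                yield current
            current = None
        elif current is not None and line.strip():
            # Assumption: one non-empty line corresponds to one paragraph.
            current["paragraphs"].append(line.strip())


# Hypothetical sample input, invented for illustration only.
sample = """<doc id="1" url="https://fi.wikipedia.org/wiki?curid=1" title="Esimerkki">
Tämä on ensimmäinen kappale.

Tämä on toinen kappale.
</doc>
"""

articles = list(iter_articles(sample))
for article in articles:
    print(article["title"], len(article["paragraphs"]))
```

Sentence splitting, tokenization and parsing are left out here, since in the corpus those steps were carried out with the Turku Dependency Parser.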