Finnish News Corpus for Named Entity Recognition


Persistent Identifier of this resource:

Access location:

The corpus consists of 953 articles (193,742 word tokens) annotated with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source.

The data sets are available at and will be available in the download service in Kielipankki – the Language Bank of Finland.

The FiNER system and its technical documentation are available at
