Language identification development and test corpora for Suomi24 and NLF corpora 
li-eval-suomi24-nlf
Persistent Identifier of this resource:
http://urn.fi/urn:nbn:fi:lb-2022021301
Access location:
This corpus includes files for evaluating language identification efficacy on the suomi24-2018-2020 (http://urn.fi/urn:nbn:fi:lb-2021101521) and the new part of the klk-v2 (http://urn.fi/urn:nbn:fi:lb-202009152) corpora.
The lines are random "sentences" from the new material processed by the Language Bank of Finland during 2021-2022.
The files originating from suomi24 are licensed under CC-BY-NC, and the files originating from klk-v2 under CC-BY.