Multilingual Resource Collection of the University of Helsinki Language Corpus Server

Last update: 2023-09-05

Helsingin yliopiston korpuspalvelimen monikielinen aineistokokoelma

UHLCS

Persistent Identifier of this resource:

http://urn.fi/urn:nbn:fi:lb-201403269

The collection is available in Kielipankki - the Language Bank of Finland (puhti.csc.fi, in the directory /appl/data/kielipankki/mrc-uhlcs/). Access rights instructions: http://www.kielipankki.fi/access

The UHLCS, maintained by the University of Helsinki, was founded in the late 1980s. At present, it contains computer corpora of more than 50 languages, including samples of minority languages and extensive corpora representing different text types. In 2000, the corpora of the Uralic, Turkic, Tungusic, Mongolic, Chukotko-Kamchatkan, Iranian and North-East Caucasian languages were edited for public use with the financial support of the Max Planck Institute for Evolutionary Anthropology, Leipzig. In summer 2003, metadata descriptions for the corpora were prepared with the financial support of the ECHO project (European Cultural Heritage Online). The UHLCS also provides tools for analyzing the corpora.

UHLCS contains the following corpora:
* Avar
* Chukchi
* Chuvash
* English
* Erzya and Moksha Mordvin (literature, journals)
* Erzya and Moksha Mordvin (word lists)
* Estonian 1
* Estonian 2
* Even
* Evenki
* Finland Swedish Text Corpus (FISC)
* Finnish (Bibles)
* Finnish (literature)
* Ingrian
* Kalmyk
* Khanty (North Khanty) (corpora and translations)
* Komi Zyrian (corpora and texts)
* Komi Zyrian (literature)
* Koryak
* Kurdish
* Lak
* Latin
* Lude (Ludian)
* Nanay
* Nenets (Tundra Nenets)
* North Saami (literature)
* North Saami (Sámikultuvradoaibmagotti smiehttamush)
* Ossete
* Swahili
* Tabassaran
* Tajik
* Turkic languages
* Ume Saami
* Uralic languages

UHLCS has many different IPR holders. Should you have any questions regarding the collection, please contact Pirkko Suihkonen (suihkonen.pirkko@gmail.com).

License details: http://urn.fi/urn:nbn:fi:lb-2015041302

The purpose of the resource use must be outlined in a research plan.

Distribution

Availability

Available - Restricted Use

Licence

Other

User Nature: Academic

Licensors:

Pirkko Suihkonen

Distribution rights holders:

CSC - Tieteen tietotekniikan keskus Oy, CSC - IT Center for Science Ltd

University of Helsinki

IPR Holder

Pirkko Suihkonen

Contact Person

User support at CSC - IT Center for Science Ltd. The Language Bank of Finland

text

Multilingual text corpus

Languages

Lak, Tabassaran, Kurdish, Ossetian (Ossetic), Tajik, Avaric, Komi Zyrian, Latin, Chukchi, Koryak, Finnish, Kildin Sami, Norwegian, Italian, English, Uzbek, Dutch, German, French, Chuvash, Ludian, Erzya, Moksha, Khanty, Ingrian, Tundra Nenets, Russian, Ume Sami, Northern Sami, Swedish (Variety: Finland Swedish; Type: Dialect), Estonian, Tatar, Udmurt, Nanai, Evenki, Even, Kalmyk (Oirat)

Linguality

Linguality type: Multilingual

Multi-linguality type: Other

Size

Modalities

Written Language

Metadata

Created: 2012-07-11

Last Updated: 2023-09-05

Metadata Creator

Imre Bartis

Relation

Related Resource: Lude (Ludian) Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Chuvash Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: English Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Erzya and Moksha Mordvin Word List Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Estonian Corpus 1 (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Estonian Corpus 2 (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: The Finland-Swedish Text Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Finnish Corpus (Bibles) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Finnish Corpus (Literature) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Ingrian Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Latin Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Lists of Words Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Nenets Corpus (Tundra Nenets) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: North Saami Corpus (Literature) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: North Saami Corpus (Sámikultuvradoaibmagotti smiehttamush) (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Ume Saami Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Uzbek-English Dictionary (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Uralic, Turkic, Indo-Iranian and Mongol languages; languages of Siberia and Caucasia (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Corpus of Erzya and Moksha Mordvin Literature and Journals and Komi Zyrian Literature (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Komi Zyrian Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Kildin Saami Corpus (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Related Resource: Quantifiers and Quantification in Finnish and Languages Spoken in the Central Volga–Kama Region (UHLCS) http://urn.fi/urn:nb...

Relation Type: HasPart

Documentation

Document Type: Other

UHLCS Resource group page, http://urn.fi/urn:nb...

Document Language: English

Change log: 2023-09-05: the version-controlled mrc-uhlcs was taken into use. The original data is available upon request.
