Khanty Corpus (North Khanty, Corpora and Translations) (UHLCS)

Hantin korpus (pohjoishantin aineistot ja käännökset) (UHLCS)

Persistent Identifier of this resource:

http://urn.fi/urn:nbn:fi:lb-2014032613

http://islrn.org/resources/156-041-809-270-6

The corpus is available in Kielipankki - the Language Bank of Finland (taito-shell.csc.fi, access rights instructions: http://www.kielipankki.fi/access).

The Khanty computer corpus contains the following sub-corpora:

Khanty, Atlym dialect, 519 words, 3967 characters
Khanty, Kazym dialect, 62766 words, 585659 characters
Khanty, Konda dialect, 1115 words, 10234 characters
Khanty, Nizjam dialect, 17681 words, 259732 characters
Khanty, Obdorsk dialect, 10939 words, 200358 characters
Khanty, Synja dialect, 10939 words, 200358 characters.

The corpora of the Khanty dialects are samples taken from the following text collections:

Rédei, Károly (1968).
Nord-ostjakische Texte (Kazym-Dialekt) mit Skizze der Grammatik.
Gesammelt und herausgegeben von Károly Rédei. Abhandlung der Akademie
der Wissenschaften in Göttingen, philologisch-historische Klasse, dritte Folge 71.
Göttingen.

Steinitz, Wolfgang (1989).
Ostjakologische Arbeiten III. Texte aus dem Nachlass.
Eds.: Hartung, Liselotte, Hauel, Petra, Sauer, Gert & Schulze, Birgitte.
Janua Linguarum, Series Practica 256.
Mouton de Gruyter, Berlin.

Vértes, Edith (1980).
H. Paasonens südostjakische Textsammlungen.
Suomalais-Ugrilaisen Seuran Toimituksia 175.
Suomalais-Ugrilainen Seura, Helsinki.

The corpora consist of running text, and several of them are morphologically analyzed. The morphologically encoded texts are in word-per-line format, and the plain texts are in sentence-per-line format. In some texts the clauses and sentences are also marked with information about their location in the text.
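The two layouts mentioned above can be illustrated with a minimal Python sketch. It assumes that words are separated by whitespace and that the end-of-line character is not counted; the corpus documentation does not state how the published word and character counts were computed, so the results may not match them. The function names and the placeholder tokens are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def count_words_and_chars(lines: Iterable[str]) -> Tuple[int, int]:
    """Count words and characters in sentence-per-line text.

    Assumption: words are whitespace-separated and the trailing
    newline is not counted. Blank lines are skipped.
    """
    words = 0
    chars = 0
    for line in lines:
        sentence = line.rstrip("\n")
        if not sentence.strip():
            continue  # skip empty lines between sentences
        words += len(sentence.split())
        chars += len(sentence)
    return words, chars


def to_word_per_line(lines: Iterable[str]) -> Iterator[str]:
    """Yield one token per item, i.e. a plain word-per-line view.

    This only splits on whitespace; it does not add any
    morphological coding.
    """
    for line in lines:
        for token in line.split():
            yield token


# Placeholder tokens, not real corpus data.
sample = ["w1 w2 w3\n", "\n", "w4 w5\n"]
print(count_words_and_chars(sample))
print(list(to_word_per_line(sample)))
```

With a real file, the same functions could be applied to an open file handle, for example `count_words_and_chars(open(path, encoding="utf-8"))`, where `path` and the encoding are again assumptions.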

Khanty, Textbook:
Rugin, R.P. (1990).
Shum jôxan sjun'öng xâtLöt.
(Shchastlivye den'ki na Shum-jugane.) [Happy days on the Shum river.]
Kniga dlja dopol'nitel'nogo chtenija v 3-4 klassax xantyjskix shkol (shuryshkarskij dialekt).
Prosveshchenie, Leningrad.

The text is available in six versions: (1) the original text, edited in the Cyrillic alphabet; (2) the same text converted to the Latin alphabet; translations of the text into (3) Finnish, (4) English and (5) Russian; and (6) the Latin-alphabet version of the original text, morphologically coded and translated into English.

Children's books:

Life of Jesus in Khanty (the Kazim dialect). (Trial edition).
Translation: Nyomysova, Yevdokiya Andreyevna &
Lozyamova, Zoya Nikiforovna.
ISBN 952-9790-25-2, ISBN 91-88394-97-2. 63 pp.
Institute for Bible Translation.
Stockholm & Helsinki 1995.

Life of Jesus in Khanty (the Kazim dialect). (Second edition).
Translation: Nyomysova, Yevdokiya Andreyevna &
Lozyamova, Zoya Nikiforovna.
ISBN 952-9790-40-6, ISBN 91-88794-83-0. 63 pp.
Institute for Bible Translation.
Stockholm & Helsinki 1997.

The computer corpora of the Khanty dialects and the textbook were compiled and edited by Merja Salo with the financial support of the Academy of Finland. The texts were adapted for public use with the financial support of the Department of General Linguistics, University of Helsinki. The children's books were donated to the University of Helsinki by the Institute for Bible Translation, Helsinki and Stockholm.

The Khanty Corpus is a part of the UHLCS corpus collection.

UHLCS has many different IPR holders. Should you have any questions regarding the collection, please contact Pirkko Suihkonen (suihkonen.pirkko@gmail.com).

License details: http://urn.fi/urn:nbn:fi:lb-20150304115
Detailed information:
http://urn.fi/urn:nbn:fi:lb-2014060214
http://www.ling.helsinki.fi/uhlcs/metadata/corpus-metadata/uralic-lgs/ugric-lgs/khanty

The intended use of the resource must be outlined in a research plan.
