Bible verses for Uralic languages, parallel corpus, version 2, Korp

Description

This resource will be available via Korp in Kielipankki – the Language Bank of Finland. These parallel corpora consist of Biblical verses (historical and contemporary, 1821–2023) in Erzya (myv), Moksha (mdf); Olonets-Karelian (Livvi) (olo), Dvina-Karelian (North Karelian Proper) (krl), Livonian (liv), Veps (vep); Khanty (kca), Mansi (mns); Komi-Permyak (koi), Komi-Zyrian (kpv), Udmurt (udm); Meadow & Eastern Mari (mhr) and Hill Mari (mrj). Most of the texts, specifically the newer translations, come from the Institute for Bible Translation in Helsinki, Finland, as originally organized for the University of Helsinki Language Corpus Server (UHLCS). Finnish, Estonian, Hungarian, Russian and Ukrainian translations are also included.
The purpose of these parallel corpora is to further the study of translation in Uralic minority languages. They also make it possible to follow changes in the lexical and syntactic strategies used in different versions of Biblical verses in one language, or to compare lexicon and structure between languages. The two closely related Slavic languages are included on the grounds that historical Slavic contact involved the colloquially spoken idioms of Kiev, Novgorod, St Petersburg and Moscow.
Lemmatization and morphological analyses are provided for all languages except Khanty, Estonian, Hungarian, Russian and Ukrainian; the accuracy of the analyses should improve as disambiguation resources are developed. The minority languages have been lemmatized and annotated for both morphological features and syntactic dependencies using open HFST analysers developed in the GiellaLT (Clarino) infrastructure. The Finnish texts have been analyzed with TNPP (Turku Neural Parser Pipeline), which includes lemmatization, morphological analysis and syntactic annotation.
The 27 books of the New Testament are included for the following 15 languages: est (2022), fin (1932–1938), hun (2021), koi (2019), kpv (2008), krl (2011), mdf (2016), mhr (2007), mrj (2014), myv (1821–1827, 2006), olo (2003), rus (1876), udm (1997, 2013), ukr (2022), vep (2006). The 39 books of the Old Testament are included for two languages: fin (1932–1938) and udm (2013). Additionally, the following books are included: kca (2013–2018): MRK, GEN, JON; koi (1996): MRK; kpv (1995–1997): MRK, JHN; krl (2020–2023): JON; liv (1942): MAT; mdf (1901): JHN; mdf (1995): MRK; mdf (2020–2022): GEN, EXO; mhr (1994–1995): MRK, LUK, JHN; mns (2000–2016): MAT, MRK, LUK, JHN, JON; myv (1910): MAT, MRK, LUK, JHN; myv (1995–1998): MAT, MRK, LUK, ACT; myv (2011–2020): RUT, PSA, ECC, SNG, JON; olo (1993–1997): MAT, MRK, LUK, JHN; olo (2006–2020): GEN, RUT, PSA, PRO, ISA, JON; udm (2016): TOB, JDT, WIS, SIR, BAR, LJE, 1MA, 2MA, 3MA, 1ES, 2ES; vep (1992–1998): MAT, MRK, JHN; vep (2012–2023): RUT, PSA, PRO, JON.

Publication year

2024

Resource type

Authors

University of Helsinki - Publisher

Axelson - Creator

Jack Rueter - Creator

User support FIN-CLARIN - Curator

Raamatunkäännösinstituutti ry - Rights holder

Project

Other information

Fields of science

Linguistics

Language

Veps, Karelian, Estonian, Finnish, Hungarian, Khanty, Komi-Permyak, Komi-Yazva, Livonian, Moksha, Meadow Mari, Mansi, Hill Mari, Erzya, Olonets Karelian (Livvi), Russian, Udmurt, Ukrainian

Availability

Open

License

Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0)

Keywords

Subject headings

Temporal coverage

Related resources