New version of the digitized Dialect Atlas of Finnish by Lauri Kettunen
This package includes alternative representations of the digitized version of Lauri Kettunen's Dialect Atlas of Finnish (Kettunen 1940). The first version of the digitized data was prepared by Professors Sheila Embleton and Eric Wheeler (York University, Canada) (Embleton & Wheeler 1997, 2000), and further refined for publication by Jyri Lehtinen under the BEDLAN project (Biological Evolution and the Diversification of Languages).

The alternative representations provided here format the data as (1) multistate characters and (2) binary characters. Both formats resemble the representation of population genetic data, and are thus easier to apply to population genetic analyses than the representation provided through the Fairdata repository. A linguistic classification of the collected traits is also provided.

The data has been prepared by the BEDLAN research project, which has conducted research on Kettunen's dialect atlas using population genetic techniques (Syrjänen et al. 2016, Honkola et al. 2018, Santaharju et al. in revision).

Citation:
Santaharju, Jenni, Kaj Syrjänen, Terhi Honkola, Perttu Seppä, Outi Vesakoski and Unni Leino (submitted revision): New version of the digitized Dialect Atlas of Finnish by Lauri Kettunen. In Digital Humanities in the Nordic and Baltic Countries Publications. Zenodo.

kettunen_multistate.csv

This file contains the multistate representation of the data. Different states are coded as integers representing the different map symbols from Kettunen's original dialect atlas (for an example, see the scans of Kettunen's original atlas map pages by Juha Kuokkala).

Municipality_number: Unique identifier for each data point (municipality).
Municipality_name: Municipality name. Names may not be unique, as several municipalities may share the same name.
lon_WGS84: WGS 84 longitude of the municipality centroid.
lat_WGS84: WGS 84 latitude of the municipality centroid.
1a - 213c: Dialect features present on each map page. The integer at the beginning matches the map page in Kettunen's atlas, so the columns for page 1 start with 1. The letter following the integer distinguishes overlapping variants within a single map page. Each page number is suffixed with as many letters as there are overlapping variants on that page; for instance, page 1 has at most 3 overlapping variants for any data point, so the table includes three columns (1a-1c). The values are integers (representing the map symbols from Kettunen's map page legends, with 1 being the topmost variant, 2 the second variant, and so on), "-" (for absent data) or "NA" (for missing data).

kettunen_binary.csv

Municipality_number: Unique identifier for each data point (municipality).
Municipality_name: Municipality name. Names may not be unique, as several municipalities may share the same name.
lon_WGS84: WGS 84 longitude of the municipality centroid.
lat_WGS84: WGS 84 latitude of the municipality centroid.
1_2 - 213_16: Dialect features present on each map page. The leftmost integer matches the map page in Kettunen's atlas, so the columns for page 1 start with 1. The second integer matches the multistate character found in the cells of kettunen_multistate.csv, and thus reflects the symbols in Kettunen's map page legends. Here too, 1 represents the topmost box in the map page legend, 2 the second box, and so on. Each data cell contains 0 (absent), 1 (present) or "NA" (missing).

kettunen_map_explanations.csv

Map number: Unique identifier for each map.
The explanation of the map: Map name and description of the dialect feature.
Level1 - Level5: Hierarchical classification of the dialect feature.

References

Embleton, Sheila M. and Eric S. Wheeler. 1997. Finnish dialect atlas for quantitative studies. Journal of Quantitative Linguistics 4.99-102.
Embleton, Sheila M. and Eric S. Wheeler. 2000. Computerized dialect atlas of Finnish: Dealing with ambiguity. Journal of Quantitative Linguistics 7.227-31.
Honkola, Terhi, Kalle Ruokolainen, Kaj Syrjänen, Unni-Päivä Leino, Ilpo Tammi, Niklas Wahlberg, and Outi Vesakoski. 2018. Evolution within a Language: Environmental differences contribute to divergence of dialect groups. BMC Evolutionary Biology 18, no. 132.
Kettunen, Lauri. 1940. Suomen Murteet III A. Murrekartasto. Helsinki: Suomalaisen kirjallisuuden seura.
Santaharju, Jenni, Terhi Honkola, Perttu Seppä, Kaj Syrjänen, Unni Leino and Outi Vesakoski (in revision): Linguistic convergence and its drivers in Finnish dialects.
Syrjänen, Kaj, Terhi Honkola, Jyri Lehtinen, Antti Leino and Outi Vesakoski. 2016. Applying population genetic approaches within languages: Finnish dialects as linguistic populations. Language Dynamics and Change 6.235-83.
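
The relationship between the column layouts of kettunen_multistate.csv and kettunen_binary.csv, as described above, can be illustrated with a minimal Python sketch. It is not the procedure the BEDLAN project used to produce the published files. The sample rows are invented, the function name multistate_to_binary is hypothetical, and the rule that a page is treated as "NA" only when all of its variant columns are "NA" is an assumption made for this example. The published binary file may also include a different set of state columns than this sketch generates.

```python
import csv
import io
import re

# Hypothetical three-municipality sample in the multistate layout;
# the values are invented for illustration and are not taken from the atlas.
SAMPLE = """Municipality_number,Municipality_name,lon_WGS84,lat_WGS84,1a,1b,1c
1,Alpha,24.9,60.2,1,3,-
2,Beta,25.7,62.2,2,-,-
3,Gamma,27.7,62.9,NA,NA,NA
"""

# Feature columns are a map page number plus a variant letter, e.g. "1a", "213c".
FEATURE_COLUMN = re.compile(r"^(\d+)([a-z])$")


def multistate_to_binary(rows):
    """Turn multistate rows into dicts with 'page_state' presence columns."""
    rows = list(rows)
    # Group feature columns by map page: {"1": ["1a", "1b", "1c"], ...}
    pages = {}
    for name in rows[0]:
        match = FEATURE_COLUMN.match(name)
        if match:
            pages.setdefault(match.group(1), []).append(name)

    # Collect every state observed on each page across all municipalities.
    states = {
        page: sorted(
            {int(r[c]) for r in rows for c in cols if r[c] not in ("-", "NA")}
        )
        for page, cols in pages.items()
    }

    result = []
    for row in rows:
        # Keep identifier and coordinate columns unchanged.
        out = {k: v for k, v in row.items() if not FEATURE_COLUMN.match(k)}
        for page, cols in pages.items():
            values = [row[c] for c in cols]
            # Assumption: the page counts as missing only if every variant is "NA".
            missing = all(v == "NA" for v in values)
            present = {int(v) for v in values if v not in ("-", "NA")}
            for state in states[page]:
                if missing:
                    out[f"{page}_{state}"] = "NA"
                else:
                    out[f"{page}_{state}"] = 1 if state in present else 0
        result.append(out)
    return result


binary_rows = multistate_to_binary(csv.DictReader(io.StringIO(SAMPLE)))
for row in binary_rows:
    print(row)
```

To try the sketch on the real file, the SAMPLE reader could be replaced with csv.DictReader over an opened kettunen_multistate.csv, assuming the file is comma-separated with the header names listed above.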
Kaj Syrjänen - Author
Unni Leino - Author
Unknown organization
Jenni Santaharju - Author
Outi Vesakoski - Author
Perttu Seppä - Author
Terhi Honkola - Author
Zenodo - Publisher