Eesti Teaduste Akadeemia Kirjastus (Estonian Academy Publishers)
Linguistica Uralica
ISSN 1736-7506 (Electronic)
ISSN 0868-4731 (Print)
Clustering Lexical Variation of Finnic Languages Based on Atlas Linguarum Fennicarum; pp. 161-184

Terhi Honkola, Jenni Santaharju, Kaj Syrjänen, Karl Pajusalu

This article examines the lexical relations of the Finnic languages. We studied whether lexical data is suitable for detecting coarse-grained and fine-grained substructure within the Finnic group. To evaluate this, we clustered old lexical variation from a dialectal dataset covering the whole Finnic-speaking area (Atlas Linguarum Fennicarum; ALFE) using quantitative methods adopted from population genetics, and compared the results with the groupings suggested in earlier linguistic literature. We found that the main lexical division lies between north-eastern and south-western Finnic. According to our lexical analysis, the Finnic languages are Finnish, North Estonian, South Estonian, Livonian, Karelian, Veps, and Votic-Ingrian. These groups matched the previously suggested divisions well, and we conclude that lexical data could be used more often to define linguistic substructure, especially in linguistic situations that involve dialect continua.


Alvre, P. 1973, Läänemeresoome aluskeele varasest murdeliigendusest, eriti eesti ja soome keelt silmas pidades. - KK, 151-162.

Anttikoski, E. 1998, Karjalan kirjakielen suunnittelu 1930-luvulla. - The Baltic-Finnic Minorities of the Barents Area and the Literary Language, Tromsø (Nordlyd: Tromsø University Working Papers on Language & Linguistics 26), 118-126.

Anttila, R. 1989, Historical and Comparative Linguistics, Amsterdam-Philadelphia (Current Issues in Linguistic Theory 6).

Ariste, P. 1965, Läänemere keelte kujunemine ja vanem arenemisjärk. - Sõna sõna kõrvale. Paul Ariste teaduslikust tegevusest, Tallinn (Emakeele Seltsi Toimetised 7), 80-105.

Bohling, J. H., Adams, J. R., Waits, L. P. 2013, Evaluating the Ability of Bayesian Clustering Methods to Detect Hybridization and Introgression Using an Empirical Red Wolf Data Set. - Molecular Ecology 22, 74-86.

Bowern, C. 2012, The Riddle of Tasmanian Languages. - Proceedings of the Royal Society B: Biological Sciences 279, 4590-4595.

Chambers, J. K., Trudgill, P. 1998, Dialectology, Cambridge (Cambridge Textbooks in Linguistics).

Corander, J., Marttinen, P. 2006, Bayesian Identification of Admixture Events Using Multilocus Molecular Markers. - Molecular Ecology 15, 2833-2843.

Corander, J., Marttinen, P., Mäntyniemi, S. 2006, A Bayesian Method for Identification of Stock Mixtures from Molecular Marker Data. - Fishery Bulletin 104, 550-558.

Corander, J., Marttinen, P., Sirén, J., Tang, J. 2008, Enhanced Bayesian Modelling in BAPS Software for Learning Genetic Structures of Populations. - BMC Bioinformatics 9, 539.

Corander, J., Sirén, J., Arjas, E. 2008, Bayesian Spatial Modelling of Genetic Population Structure. - Computational Statistics 23, 111-129.

Corander, J., Waldmann, P., Sillanpää, M. J. 2003, Bayesian Analysis of Genetic Differentiation between Populations. - Genetics 163, 367-374.

Dunn, M., Levinson, S. C., Lindström, E., Reesink, G., Terrill, A. 2008, Structural Phylogeny in Historical Linguistics: Methodological Explorations Applied in Island Melanesia. - Language 84, 710-759.

Evanno, G., Regnaut, S., Goudet, J. 2005, Detecting the Number of Clusters of Individuals Using the Software Structure: a Simulation Study. - Molecular Ecology 14, 2611-2620.

Fjellheim, S., Jørgensen, M. H., Kjos, M., Borgen, L. 2009, A Molecular Study of Hybridization and Homoploid Hybrid Speciation in Argyranthemum (Asteraceae) on Tenerife, the Canary Islands. - Botanical Journal of the Linnean Society 159, 19-31.

Francis, R. M. 2017, Pophelper: an R Package and Web App to Analyse and Visualize Population Structure. - Molecular Ecology Resources 17, 27-32.

Frog, Saarikivi, J. 2015, De situ linguarum fennicarum aetatis ferreae. Pars I. - RMN Newsletter 9, 64-115.

Gooskens, C. 2007, The Contribution of Linguistic Factors to the Intelligibility of Closely Related Languages. - Journal of Multilingual and Multicultural Development 28, 445-467.

Greenhill, S. J., Wu, C., Hua, X., Dunn, M., Levinson, S. C., Gray, R. D. 2017, Evolutionary Dynamics of Language Systems. - Proceedings of the National Academy of Sciences of the United States of America 114, E8822-E8829.

Grünthal, R. 2015, Vepsän kielioppi, Helsinki (Apuneuvoja suomalais-ugrilaisten kielten opintoja varten 17).

Hammarström, H., Forkel, R., Haspelmath, M. 2018, Glottolog 3.3, Jena.

Honkola, T., Ruokolainen, K., Syrjänen, K. J. J., Leino, U., Tammi, I., Wahlberg, N., Vesakoski, O. 2018, Evolution within a Language: Environmental Differences Contribute to Divergence of Dialect Groups. - BMC Evolutionary Biology 18, 132.

Itkonen, T. 1983, Välikatsaus suomen kielen juuriin. - Vir., 190-229.

Jakobsson, M., Rosenberg, N. A. 2007, CLUMPP: a Cluster Matching and Permutation Program for Dealing with Label Switching and Multimodality in Analysis of Population Structure. - Bioinformatics 23, 1801-1806.

Kallio, P. 2007, Kantasuomen konsonanttihistoriaa. - Sámit, sánit, sátnehámit. Riepmočála Pekka Sammallahtii miessemánu 21. beaivve 2007, Helsinki (MSFOu 253), 229-249.

Kallio, P. 2014, The Diversification of Proto-Finnic. - Fibula, Fabula, Fact: The Viking Age in Finland, Helsinki (Studia Fennica Historica 18), 155-168.

Kallio, P. 2015a, The Stratigraphy of the Germanic Loanwords in Finnic. - Early Germanic Languages in Contact, Amsterdam-Philadelphia (North-Western European Language Evolution Supplement Series 27), 23-38.

Kallio, P. 2015b, The Language Contact Situation in Prehistoric Northeastern Europe. - The Linguistic Roots of Europe. Origin and Development of European Languages, Copenhagen (Copenhagen Studies in Indo-European 6), 77-102.

Kettunen, L. 1930, Suomen murteet II. Murrealueet, Helsinki (SKST 188).

Kettunen, L. 1940, Suomen murteet III A. Murrekartasto, Helsinki (SKST 188).

Kettunen, L. 1960, Suomen lähisukukielten luonteenomaiset piirteet, Helsinki (MSFOu 119).

Koponen, E. 1991, Itämerensuomen marjannimistön kehityksen päälinjoja ja kantasuomen historiallista dialektologiaa. - JSFOu 83, 123-161.

Laakso, J. 1991, Itämerensuomalaiset sukukielemme ja niiden puhujat. - Uralilaiset kansat. Tietoa suomen sukukielistä ja niiden puhujista, Porvoo-Helsinki-Juva, 49-122.

Laakso, J. 2001, The Finnic Languages. - The Circum-Baltic Languages. Typology and Contact. Volume 1. Past and Present, Amsterdam-Philadelphia (Studies in Language Companion Series 54), 179-215.

Laanest, A. 1975, Sissejuhatus läänemeresoome keeltesse, Tallinn.

Laanest, A., Jussila, R. 1989, Itämerensuomalainen kielikartasto: kyselysarja, Helsinki.

Lang, V. 2018, Läänemeresoome tulemised, Tartu (Muinasaja teadus 28).

Lehtinen, T. 2007, Kielen vuosituhannet: suomen kielen kehitys kantauralista varhaissuomeen, Helsinki (Tietolipas 215).

Leino, A., Hyvönen, S., Salmenkivi, M. 2006, Mitä murteita suomessa onkaan? Murresanaston levikin kvantitatiivista analyysiä. - Vir., 26-45.

Neophytou, C. 2014, Bayesian Clustering Analyses for Genetic Assignment and Study of Hybridization in Oaks: Effects of Asymmetric Phylogenies and Asymmetric Sampling Schemes. - Tree Genetics & Genomes 10, 273-285.

Ojansuu, H. 1922, Itämerensuomalaisten kielten pronominioppia, Helsinki (Turun suomalaisen yliopiston julkaisuja. Sarja B, osa 1, nro 3).

Pahomov, M. 2017, Lyydiläiskysymys: Kansa vai heimo, kieli vai murre? Väitöskirja (monografia), Helsinki.

Pritchard, J. K., Stephens, M., Donnelly, P. 2000, Inference of Population Structure Using Multilocus Genotype Data. - Genetics 155, 945-959.

Pritchard, J. K., Wen, X., Falush, D. 2010, Documentation for Structure Software: Version 2.3. Structure_Manual_doc.pdf

Puechmaille, S. J. 2016, The Program Structure Does Not Reliably Recover the Correct Population Structure When Sampling Is Uneven: Subsampling and New Estimators Alleviate the Problem. - Molecular Ecology Resources 16, 608-627.

Puza, B. 2015, Bayesian Methods for Statistical Analysis, Canberra.

Rapola, M. 1962, Johdatus suomen murteisiin, Turku (Tietolipas 4).

Raun, A. 1971, Essays in Finno-Ugric and Finnic Linguistics, The Hague (UAS 107).

Reesink, G., Singer, R., Dunn, M. 2009, Explaining the Linguistic Diversity of Sahul Using Population Models. - PLoS Biology 7, e1000241.

Saareste, A. 1938, Eesti murdeatlas. Atlas des parlers estoniens. I vihik, Tartu.

Saareste, A. 1941, Eesti murdeatlas. Atlas des parlers estoniens. II vihik, Tartu.

Saareste, A. 1955, Petit Atlas des parlers estoniens. Väike eesti murdeatlas, Uppsala (Skrifter utgivna av Kungl. Gustav Adolfs Akademien 28).

Salminen, T. 1998, Pohjoisten itämerensuomalaisten kielten luokittelun ongelmia. - Oekeeta asijoo. Commentationes Fenno-Ugricae in honorem Seppo Suhonen sexagenarii, Helsinki (MSFOu 228), 390-406.

Salminen, T. 2007, Europe and North Asia. - Encyclopedia of the World’s Endangered Languages, London, 211-282.

Sammallahti, P. 1977, Suomalaisten esihistorian kysymyksiä. - Vir., 119-136.

Setälä, E. N. 1916, Suomensukuisten kansojen esihistoria. - Maailmanhistoria II, Helsinki, 476-516.

Simons, G. F., Fennig, C. D. 2018, Ethnologue: Languages of the World. 21st Edition, Dallas.

Sulkala, H. 2010, Introduction. Revitalisation of the Finnic Minority Languages. - Planning a New Standard Language. Finnic Minority Languages Meet the New Millennium, Helsinki (Studia Fennica Linguistica 15), 8-26.

Syrjänen, K., Honkola, T., Lehtinen, J., Leino, A., Vesakoski, O. 2016, Applying Population Genetic Approaches within Languages. Finnish Dialects as Linguistic Populations. - Language Dynamics and Change 6, 235-283.

Thomason, S. 2000, Linguistic Areas and Language History. - Languages in Contact, Amsterdam (Studies in Slavic and General Linguistics 28), 311-327.

Tuomi, T., Hänninen, A., Suhonen, S. 2004, Atlas Linguarum Fennicarum 1, Helsinki (SKST 800, Kotimaisten kielten tutkimuskeskuksen julkaisuja 118).

Tuomi, T., Hänninen, A., Viitso, T. 2007, Atlas Linguarum Fennicarum 2, Helsinki (SKST 800, Kotimaisten kielten tutkimuskeskuksen julkaisuja 118).

Tuomi, T., Hänninen, A., Rjagojev, V. 2010, Atlas Linguarum Fennicarum 3, Helsinki (SKST 1295, Kotimaisten kielten tutkimuskeskuksen julkaisuja 159).

Turunen, A. 1988, The Balto-Finnic Languages. - The Uralic Languages. Description, History and Foreign Influences, Leiden, 58-83.

Viitso, T.-R. 1998, Fennic. - The Uralic Languages, London (Routledge Language Family Descriptions), 96-114.

Viitso, T.-R. 2008, Läänemeresoome murdeliigenduse põhijooned. - Liivi keel ja läänemeresoome keelemaastikud, Tartu-Tallinn, 63-69.

Weijnen, A. 1975, Atlas Linguarum Europae (ALE). Introduction, Assen.

Бубрих, Д. В., Беляков, А. А., Пунжина, А. В. 1997, Диалектологический атлас карельского языка. Karjalan kielen murrekartasto, Helsinki (Kotimaisten kielten tutkimuskeskuksen julkaisuja 97).
