Eesti Teaduste Akadeemia Kirjastus (Estonian Academy Publishers)
The Yearbook of the Estonian Mother Tongue Society
Impact Factor (2022): 0.3
doi:10.3176/esa61.10

Ann Siiman

How the choice between the singular long and short illative case is related to morphosyntactic and semantic variables – which material and methods are suitable for a corpus analysis

The article examines the choice between the singular long and short illative case. It attempts to find out which morphosyntactic and semantic variables are statistically significant for choosing the long or short illative case. The variables considered are part of speech, part of sentence, government, fixed word combination, proper or common noun, proper noun semantic group, common noun semantic group and meaning of the verb lemma.
The material comes from the Estonian web corpus etTenTen and the Estonian Reference Corpus. The final data set comprised 840 illative case forms. It was balanced (420 long illative forms and 420 short illative forms), and each word form was included only once. A chi-square test was run and standardized Pearson residuals were computed in the statistical software R. The results were checked with a so-called part-whole method and with Cramér’s V as a measure of effect size.
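The statistical steps named above (a chi-square test, standardized Pearson residuals and Cramér’s V) can be illustrated with a small sketch. The article's analysis was done in R; the version below uses plain Python instead, and the counts in `counts` are hypothetical values chosen only to match the balanced design of 420 long and 420 short forms. They are not taken from the article, and the function name `chi_square_summary` is likewise an illustrative assumption.

```python
from math import sqrt


def chi_square_summary(table):
    """Return (chi-square statistic, standardized Pearson residuals, Cramér's V)
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Standardized (adjusted) Pearson residual for this cell.
            scale = sqrt(
                expected
                * (1 - row_totals[i] / n)
                * (1 - col_totals[j] / n)
            )
            residual_row.append((observed - expected) / scale)
        residuals.append(residual_row)

    # Cramér's V rescales chi-square by sample size and table dimensions.
    k = min(len(table), len(table[0]))
    cramers_v = sqrt(chi2 / (n * (k - 1)))
    return chi2, residuals, cramers_v


# HYPOTHETICAL counts, not from the article.
# Rows: long illative, short illative. Columns: proper noun, common noun.
counts = [[60, 360], [20, 400]]
chi2, residuals, v = chi_square_summary(counts)
print(f"chi-square = {chi2:.3f}, Cramér's V = {v:.3f}")
for label, row in zip(["long", "short"], residuals):
    print(label, [round(r, 2) for r in row])
```

In such a table, a positive residual marks a cell that occurs more often than independence would predict, and a negative residual marks one that occurs less often. The chi-square statistic indicates whether an association exists at all, the residuals show which cells drive it, and Cramér’s V indicates how strong it is.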
Based on the quantitative and qualitative analyses, five variables were found to be important: government, fixed word combination, proper or common noun, proper noun semantic group and common noun semantic group. Government structures, proper nouns, and the levels ‘person names’ and ‘place names’ of the variable proper noun semantic group prefer the long illative case. The short illative case is preferred in fixed word combinations and with the levels ‘place phrases’ and ‘state phrases’ of the variable common noun semantic group. In this univariable corpus-based study these five variables were statistically significant, but further studies should test whether they remain significant in multivariable analysis or in experiment-based studies.


