ELTE BTK Magyar Nyelvtudományi és Finnugor Intézet

Uiboaed, Kristel (Tartu)


Collostructional analysis of Estonian dialects

This study presents the first attempt to apply collostructional analysis (a blend of collocational and constructional analysis; Stefanowitsch, Gries 2003; 2005, Gries et al. 2005) to Estonian dialects. Collostructional analysis focuses on the relationships between words and the constructions they form (Stefanowitsch, Gries 2003) and adopts the terminology of Construction Grammar (Goldberg 1995). The present study examines constructions consisting of a non-finite verb form and a finite verb form. The aim is to determine which verbal constructions are more common in the different dialects, as measured by the values of an association measure, the Mutual Sensitivity Coefficient (MS) (Wiechmann 2008).
MS is used to calculate the collostructional strength between a finite verb and a non-finite verb form in the same clause. Clause boundaries were set automatically using a parser of Estonian that has been adapted for dialect parsing (Lindström, Müürisep 2009). In addition to MS, Correspondence Analysis is applied to identify which dialects are most similar in terms of the studied constructions.
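To illustrate how an association score of this kind can be computed for pairs of a finite verb and a non-finite verb form, here is a minimal sketch. It assumes that MS corresponds to the minimum sensitivity measure, i.e. the smaller of the two conditional relative frequencies of the pair; the abstract itself does not give the formula, so this definition is an assumption. The function names and the toy clause data are hypothetical and are not drawn from CED.

```python
from collections import Counter


def minimum_sensitivity(pair_count, finite_count, nonfinite_count):
    """Return the smaller of the pair's two conditional relative frequencies.

    ASSUMPTION: MS = min(a / (a + b), a / (a + c)), where a is the pair
    frequency, a + b the frequency of the finite verb and a + c the
    frequency of the non-finite form. The abstract does not state this.
    """
    if pair_count == 0:
        return 0.0
    return min(pair_count / finite_count, pair_count / nonfinite_count)


def ms_scores(pairs):
    """Score every (finite verb, non-finite form) pair observed in `pairs`.

    `pairs` holds one tuple per clause in which both forms co-occur.
    """
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    finite_counts = Counter(finite for finite, _ in pairs)
    nonfinite_counts = Counter(nonfinite for _, nonfinite in pairs)
    return {
        (finite, nonfinite): minimum_sensitivity(
            count, finite_counts[finite], nonfinite_counts[nonfinite]
        )
        for (finite, nonfinite), count in pair_counts.items()
    }


# Invented toy data for one dialect: (finite verb lemma, non-finite form).
toy_clauses = (
    [("hakkama", "tegema")] * 3
    + [("hakkama", "minema")]
    + [("pidama", "tegema")]
    + [("saama", "teha")] * 2
)

scores = ms_scores(toy_clauses)

# List the pairs from the strongest to the weakest association.
for pair, score in sorted(scores.items(), key=lambda item: (-item[1], item[0])):
    print(pair, round(score, 2))
```

Under this assumed definition a pair scores highly only when each member predicts the other. A frequent finite verb that combines with many different non-finite forms therefore does not inflate the score. Running the same computation separately for each dialect would yield per-dialect rankings that can then be compared.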
The data come from the morphologically annotated Corpus of Estonian Dialects (CED), which contains data from all ten dialects of Estonian (over 550 000 tokens in total).
The presentation outlines the methods used to extract finite + non-finite verb constructions from CED and describes the similarities and differences between dialect groups. The constructions with the highest MS values are examined in more depth, with a short overview of their semantic and morphosyntactic properties.


CED = Corpus of Estonian Dialects. www.murre.ut.ee. [01.09.2010]
Goldberg, Adele E. 1995. Constructions: a construction grammar approach to argument structure. Chicago, IL: University of Chicago Press.
Gries, Stefan Th., Beate Hampe & Doris Schönefeld. 2005. Converging evidence: Bringing together experimental and corpus data on the association of verbs and constructions. Cognitive Linguistics 16(4). 635-676.
Lindström, Liina & Kaili Müürisep. 2009. Parsing Corpus of Estonian Dialects. In Kristiina Jokinen & Eckhard Bick (eds.), Proceedings of the 17th Nordic Conference on Computational Linguistics NODALIDA 2009, 14-16 May, Odense, Denmark.
Stefanowitsch, Anatol & Stefan Th. Gries. 2003. Collostructions: Investigating the interaction between words and constructions. International Journal of Corpus Linguistics 8(2). 209-243.
Stefanowitsch, Anatol & Stefan Th. Gries. 2005. Corpora and Grammar. In Anke Lüdeling & Merja Kytö (eds.), Corpus Linguistics: an international handbook, vol. 2, 933-951. Berlin & New York: Mouton de Gruyter. 
Wiechmann, Daniel. 2008. On the computation of collostruction strength: Testing measures of association as expressions of lexical bias. Corpus Linguistics and Linguistic Theory 4(2). 253-290.