Fuzzy lexical-semantic relations in Estonian Wordnet
This paper gives an overview of the principles of wordnets in general and focuses mainly on the Estonian Wordnet (EstWN). The latest version of EstWN contains more than 72,000 concepts, and 51 different lexical relations are used to form a network of more than 230,000 semantic relations between them. The main relations in EstWN are hyperonymy, meronymy, involvement and fuzzynymy (in Princeton WordNet (PWN), for example, hyperonymy is the most implemented relation). This richness of relation types creates problems, since it is not always clear which relation applies. In the case of hyperonymy, the developers of EstWN have had difficulty choosing, preferably, a single suitable hyperonym for each concept. When dealing with meronymy, the more specific relations, involved location and involved direction (both source and target direction), are determined inconsistently. There are, however, no significant problems with the involved instrument and involved agent relations. PWN has no explicit involved location or involved direction relation. Meronymy relations are often associated with the problem of connecting encyclopedic concepts to those of general language, for example how to connect the concept ‘bird’ to a specific bird species.
In EstWN the general language vocabulary is well covered, and specific domain vocabularies (architecture, medicine, economy etc.) are also incorporated; it would be useful to connect the specific vocabulary to the general language vocabulary. The paper proposes that complementary information provided by domain labels could be an answer to this problem. The last semantic relation discussed in this paper is fuzzynymy, since it is the third most used relation in EstWN. Fuzzynymy is a free association relation, but it is clear that some groups emerge within the fuzzynymy relation that could be defined as new specific relations in Estonian.
Recently EstWN has become an increasingly used resource in Estonian language technology, so it is important to improve the quality and consistency of its relations in addition to increasing the number of concepts in different domains.
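The consistency issues described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a wordnet is stored as a list of (source concept, relation type, target concept) triples; the toy concepts, the relation labels and the helper names `relation_counts` and `multiple_hyperonyms` are hypothetical and do not reflect the actual EstWN data or tooling. The sketch counts how often each relation type is used and lists concepts that have more than one hyperonym, which is the kind of case the developers would want to review.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, not taken from EstWN:
# (source concept, relation type, target concept)
RELATIONS = [
    ("sparrow", "has_hyperonym", "bird"),
    ("bird", "has_hyperonym", "animal"),
    ("knife", "has_hyperonym", "tool"),
    ("knife", "has_hyperonym", "weapon"),
    ("cut", "involved_instrument", "knife"),
    ("doctor", "fuzzynym", "hospital"),
]


def relation_counts(relations):
    """Count how many times each relation type occurs."""
    return Counter(rel for _, rel, _ in relations)


def multiple_hyperonyms(relations):
    """Return concepts linked to more than one hyperonym."""
    hyperonyms = defaultdict(set)
    for source, rel, target in relations:
        if rel == "has_hyperonym":
            hyperonyms[source].add(target)
    return {
        concept: sorted(targets)
        for concept, targets in hyperonyms.items()
        if len(targets) > 1
    }


if __name__ == "__main__":
    print(relation_counts(RELATIONS).most_common())
    print(multiple_hyperonyms(RELATIONS))
```

The same grouping approach could, under the same assumptions, be applied to fuzzynym pairs to look for recurring patterns that might deserve a more specific relation.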