Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/151496
Title: Automatic lexical acquisition from Raw Corpora: an application to Russian
Author: Oliver, Antoni  
Castellón Masalles, Irene  
Màrquez, Lluís  
Citation: Oliver, A. [Antoni], Castellón, I. [Irene] & Màrquez, L. [Lluís]. (2003). Automatic lexical acquisition from Raw Corpora: an application to Russian. In Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages (pp. 17–24). Budapest, Hungary. Association for Computational Linguistics.
Abstract: This paper presents a methodology for the automatic acquisition of lexical and morpho-syntactic information from raw corpora. The system uses information about inflectional morphology declared by rules and is based on the co-occurrence of different forms of the same paradigm in the corpus. A direct application of this methodology gives very poor precision rates due to rule interaction between paradigms. We present a rule analysis algorithm that solves this problem, giving much better precision rates, although recall decreases dramatically. Finally, we investigate some techniques to raise the recall, achieving recall rates around 67% with a precision of 92%.
Document type: info:eu-repo/semantics/conferenceObject
Issue Date: Apr-2003
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Conferencias
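
Illustrative note: the core idea named in the abstract, proposing a lexical entry when several forms of the same inflectional paradigm co-occur in a corpus, can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The paradigm name `noun_fem_a`, its ending list, the `min_forms` threshold and the function `candidate_lemmas` are all assumptions made for illustration. The rule analysis algorithm and the recall-raising techniques mentioned in the abstract are not reproduced here.

```python
from collections import defaultdict

# Hypothetical toy paradigm (an assumption, not taken from the paper):
# a few singular endings of Russian feminine nouns in -а.
# The first ending is treated as the citation form.
PARADIGMS = {
    "noun_fem_a": ["а", "ы", "е", "у", "ой"],
}


def candidate_lemmas(tokens, paradigms, min_forms=2):
    """Propose (lemma, paradigm) pairs supported by co-occurring word forms.

    A stem is kept only when at least `min_forms` distinct corpus words
    can be built from it with endings of the same paradigm.
    """
    support = defaultdict(set)
    for word in set(tokens):
        for name, endings in paradigms.items():
            for ending in endings:
                # Strip the ending to obtain a candidate stem.
                if word.endswith(ending) and len(word) > len(ending):
                    stem = word[: -len(ending)]
                    support[(stem, name)].add(word)
    return {
        (stem + paradigms[name][0], name): sorted(forms)
        for (stem, name), forms in support.items()
        if len(forms) >= min_forms
    }


if __name__ == "__main__":
    corpus = ["лампа", "лампу", "лампы", "стол", "дома"]
    print(candidate_lemmas(corpus, PARADIGMS))
```

In this toy corpus only the stem "ламп" is attested with more than one ending, so it is the only candidate kept. The form "дома" matches the ending "а" but has no further support, which hints at how endings shared across paradigms can produce spurious candidates when more paradigms are declared.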

Files in This Item:
File: 2003-AutomaticLexicalAcquisition-Oliver-Castellon-Marquez.pdf
Size: 460,87 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.