Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/46021
Title: TBXTools: A free, fast and flexible tool for automatic terminology extraction
Author: Oliver González, Antoni
Vàzquez Garcia, Mercè
Abstract: The manual identification of terminology from specialized corpora is a complex task that needs to be addressed by flexible tools in order to facilitate the construction of multilingual terminologies, which are the main resources for computer-assisted translation tools, machine translation and ontologies. The automatic terminology extraction tools developed so far use either proprietary code or open source code that is limited to certain software functionalities. To automatically extract terms from specialized corpora for different purposes, such as constructing dictionaries, thesauruses or translation memories, we need open source tools into which new functionalities can easily be integrated to improve term selection. This paper presents TBXTools, a free automatic terminology extraction tool that implements linguistic and statistical methods for multiword term extraction. The tool allows users to easily identify multiword terms from specialized corpora and, if needed, translation candidates from parallel corpora. We present the main features of TBXTools along with evaluation results for term extraction on several corpora, using both statistical and linguistic methodology.
Keywords: Terminology extraction
Computational linguistics
Document type: info:eu-repo/semantics/article
Issue Date: 15-Sep-2015
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Articles científics
Treballs, papers de recerca

Files in This Item:
File: TBXTools-A free, fast and flexible tool for automatic terminology extraction.pdf
Description: Article
Size: 134,62 kB
Format: Adobe PDF