Title: TBXTools: A free, fast and flexible tool for automatic terminology extraction
Author: Oliver González, Antoni
Vàzquez Garcia, Mercè  
Keywords: Terminology extraction
Computational linguistics
Issue Date: 15-Sep-2015
Publisher: Association for Computational Linguistics (ACL)
Abstract: The manual identification of terminology from specialized corpora is a complex task that needs to be addressed by flexible tools in order to facilitate the construction of multilingual terminologies, which are the main resources for computer-assisted translation tools, machine translation and ontologies. The automatic terminology extraction tools developed so far use either proprietary code or open source code that is limited to certain software functionalities. To automatically extract terms from specialized corpora for different purposes, such as constructing dictionaries, thesauruses or translation memories, we need open source tools into which new functionalities can be easily integrated to improve term selection. This paper presents TBXTools, a free automatic terminology extraction tool that implements linguistic and statistical methods for multiword term extraction. The tool allows users to easily identify multiword terms from specialized corpora and, if needed, translation candidates from parallel corpora. In this paper we present the main features of TBXTools along with evaluation results for term extraction on several corpora, using both statistical and linguistic methodologies.
Language: English
ISSN: 1313-8502
Appears in Collections: Research papers

Files in This Item:
File: TBXTools-A free, fast and flexible tool for automatic terminology extraction.pdf
Description: Article
Size: 134.62 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.