Please use this identifier to cite or link to this item:
http://hdl.handle.net/10609/46021
Title: | TBXTools: A free, fast and flexible tool for automatic terminology extraction |
Author: | Oliver, Antoni; Vàzquez, Mercè |
Abstract: | The manual identification of terminology in specialized corpora is a complex task that calls for flexible tools, in order to facilitate the construction of multilingual terminologies, which are the main resources for computer-assisted translation tools, machine translation and ontologies. The automatic terminology extraction tools developed so far use either proprietary code or open source code that is limited to certain functionalities. To automatically extract terms from specialized corpora for purposes such as constructing dictionaries, thesauruses or translation memories, we need open source tools into which new functionalities can easily be integrated to improve term selection. This paper presents TBXTools, a free automatic terminology extraction tool that implements linguistic and statistical methods for multiword term extraction. The tool allows users to easily identify multiword terms in specialized corpora and, if needed, translation candidates in parallel corpora. We present the main features of TBXTools along with evaluation results for term extraction on several corpora, using both statistical and linguistic methodologies. |
Keywords: | Terminology extraction; Computational linguistics |
Document type: | info:eu-repo/semantics/article |
Issue Date: | 15-Sep-2015 |
Publication license: | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
Appears in Collections: | Articles cientÍfics Treballs, papers de recerca |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
TBXTools-A free, fast and flexible tool for automatic terminology extraction.pdf | Article | 134.62 kB | Adobe PDF |
This item is licensed under a Creative Commons License