Please use this identifier to cite or link to this item:
http://hdl.handle.net/10609/80325
Title: | Improving term candidates selection using terminological tokens |
Author: | Vàzquez Garcia, Mercè; Oliver González, Antoni |
Citation: | Vàzquez, M.; Oliver, A. (2018). "Improving term candidates selection using terminological tokens". Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication, p. 122-147. ISSN 0929-9971. DOI: 10.1075/term.00016.vaz |
Abstract: | The identification of reliable terms from domain-specific corpora using computational methods is a task that has to be validated manually by specialists, which is a highly time-consuming activity. To reduce this effort and improve term candidate selection, we implemented the Token Slot Recognition method, a filtering method based on terminological tokens which is used to rank extracted term candidates from domain-specific corpora. This paper presents the implementation of the term candidates filtering method we developed in linguistic and statistical approaches applied for automatic term extraction using several domain-specific corpora in different languages. We observed that the filtering method outperforms term candidate selection by ranking a higher number of terms at the top of the term candidate list than raw frequency, and for statistical term extraction the improvement is between 15% and 25% both in precision and recall. Our analyses further revealed a reduction in the number of term candidates to be validated manually by specialists. In conclusion, the number of term candidates extracted automatically from domain-specific corpora has been reduced significantly using the Token Slot Recognition filtering method, so term candidates can be easily and quickly validated by specialists. |
Keywords: | automatic term extraction; terminological tokens; TSR filtering method; terminology extraction; domain-specific corpora; terminological units; TBXTools; term candidates |
DOI: | 10.1075/term.00016.vaz |
Document type: | info:eu-repo/semantics/article |
Issue Date: | 11-Jun-2018 |
Publication license: | https://creativecommons.org/licenses/by-nc/4.0/ |
Appears in Collections: | Articles; Articles científics |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Vàzquez-Oliver_Improving term candidates selection using-terminological-tokens.pdf | | 520,38 kB | Adobe PDF |
This item is licensed under a Creative Commons License