Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/109799
Full metadata record
DC Field | Value | Language
dc.contributor.author | Farrús, Mireia | -
dc.contributor.author | Costa Jussà, Marta R. | -
dc.contributor.author | Mariño, José B. | -
dc.contributor.author | Poch Riera, Marc | -
dc.contributor.author | Hernández Huerta, Adolfo | -
dc.contributor.author | Henríquez, Carlos | -
dc.contributor.author | Rodríguez Fonollosa, José A. | -
dc.contributor.other | Universitat Oberta de Catalunya. Internet Interdisciplinary Institute (IN3) | -
dc.contributor.other | Universitat Politècnica de Catalunya (UPC) | -
dc.date.accessioned | 2020-02-18T08:23:45Z | -
dc.date.available | 2020-02-18T08:23:45Z | -
dc.date.issued | 2011-02-20 | -
dc.identifier.citation | Farrús Cabeceran, M., Costa-Jussà, M.R., Marino, J.B., Poch, M., Hernandez, A., Henriquez, C. & Rodriguez Fonollosa, J.A. (2011). Overcoming statistical machine translation limitations: error analysis and proposed solutions for the Catalan-Spanish language pair. Language Resources and Evaluation, 45(2), 181-208. doi: 10.1007/s10579-011-9137-0 | es
dc.identifier.issn | 1574-020X | -
dc.identifier.uri | http://hdl.handle.net/10609/109799 | -
dc.description.abstract | This work aims to improve an N-gram-based statistical machine translation system between the Catalan and Spanish languages, trained with an aligned Spanish-Catalan parallel corpus consisting of 1.7 million sentences taken from El Periódico newspaper. Starting from a linguistic error analysis above this baseline system, orthographic, morphological, lexical, semantic and syntactic problems are approached using a set of techniques. The proposed solutions include the development and application of additional statistical techniques, text pre- and post-processing tasks, and rules based on the use of grammatical categories, as well as lexical categorization. The performance of the improved system is clearly increased, as is shown in both human and automatic evaluations of the system, with a gain of about 1.1 points BLEU observed in the Spanish-to-Catalan direction of translation, and a gain of about 0.5 points in the reverse direction. The final system is freely available online as a linguistic resource. | en
dc.format | AR/0000002624 | -
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | -
dc.publisher | Language Resources and Evaluation | -
dc.relation.ispartof | Language Resources and Evaluation, 2011, 45(2) | -
dc.relation.uri | https://doi.org/10.1007/s10579-011-9137-0 | -
dc.rights | CC BY-NC-ND | -
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | -
dc.subject | statistical machine translation | en
dc.subject | n-gram-based translation | en
dc.subject | linguistic knowledge | en
dc.subject | grammatical categories | en
dc.subject | traducción automática estadística | es
dc.subject | traducció automàtica estadística | ca
dc.subject | traducció basada en n-grames | ca
dc.subject | traducción basada en n-gramas | es
dc.subject | coneixements lingüístics | ca
dc.subject | conocimientos lingüísticos | es
dc.subject | categories gramaticals | ca
dc.subject | categorías gramaticales | es
dc.subject.lcsh | Machine translating | en
dc.title | Overcoming statistical machine translation limitations: error analysis and proposed solutions for the Catalan-Spanish language pair | -
dc.type | info:eu-repo/semantics/article | -
dc.subject.lemac | Traducció automàtica | ca
dc.subject.lcshes | Traducción automática | es
dc.rights.accessRights | info:eu-repo/semantics/openAccess | -
dc.identifier.doi | 10.1007/s10579-011-9137-0 | -
dc.gir.id | AR/0000002624 | -
dc.relation.projectID | info:eu-repo/grantAgreement/TEC2009-14094-C04-01 | -
dc.type.version | info:eu-repo/semantics/acceptedVersion | -
Appears in collections: Articles
Articles científics

Files in this item:
File | Description | Size | Format
Farrus_LRE_Overcoming.pdf | | 367.31 kB | Adobe PDF