Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/93058
Full metadata record
DC Field | Value | Language
dc.contributor.author | Lastra Díaz, Juan José | -
dc.contributor.author | García Serrano, Ana | -
dc.contributor.author | Batet, Montserrat | -
dc.contributor.author | Fernández, Miriam | -
dc.contributor.author | Chirigati, Fernando | -
dc.contributor.other | Universidad Nacional de Educación a Distancia | -
dc.contributor.other | Open University | -
dc.contributor.other | New York University | -
dc.date.accessioned | 2019-04-11T07:54:00Z | -
dc.date.available | 2019-04-11T07:54:00Z | -
dc.date.issued | 2017-02-21 | -
dc.identifier.citation | Lastra Díaz, J.J., García Serrano, A., Batet Sanromà, M., Fernández, M. & Chirigati, F. (2017). HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. Information Systems, 66, 97-118. doi: 10.1016/j.is.2017.02.002 | en
dc.identifier.issn | 0306-4379 | -
dc.identifier.uri | http://hdl.handle.net/10609/93058 | -
dc.description.abstract | This work is a detailed companion reproducibility paper of the methods and experiments proposed by Lastra-Díaz and García-Serrano in (2015, 2016) [56-58]. It introduces the following contributions: (1) a new and efficient representation model for taxonomies, called PosetHERep, which is an adaptation of the half-edge data structure commonly used to represent discrete manifolds and planar graphs; (2) a new Java software library called the Half-Edge Semantic Measures Library (HESML), based on PosetHERep, which implements most ontology-based semantic similarity measures and Information Content (IC) models reported in the literature; (3) a set of reproducible experiments on word similarity based on HESML and ReproZip, with the aim of exactly reproducing the experimental surveys in the three aforementioned works; (4) a replication framework and dataset, called WNSimRep v1, whose aim is to assist the exact replication of most methods reported in the literature; and finally, (5) a set of scalability and performance benchmarks for semantic measures libraries. PosetHERep and HESML are motivated by several drawbacks in the current semantic measures libraries, especially their performance and scalability, as well as the evaluation of new methods and the replication of most previous methods. The reproducible experiments introduced herein are motivated by the lack of a set of large, self-contained and easily reproducible experiments with the aim of replicating and confirming previously reported results. Likewise, the WNSimRep v1 dataset is motivated by the discovery of several contradictory results and difficulties in reproducing previously reported methods and experiments. PosetHERep proposes a memory-efficient representation for taxonomies which scales linearly with the size of the taxonomy and provides an efficient implementation of most taxonomy-based algorithms used by the semantic measures and IC models, whilst HESML provides an open framework to aid research in the area by offering a simpler and more efficient software architecture than the current software libraries. Finally, we show that HESML outperforms the state-of-the-art libraries, and that their performance and scalability can be significantly improved without caching by using PosetHERep. | en
dc.language.iso | eng | -
dc.publisher | Information Systems | -
dc.relation.ispartof | Information Systems, 2017, 6 | -
dc.relation.uri | https://doi.org/10.1016/j.is.2017.02.002 | -
dc.rights | CC BY-NC-ND | -
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | -
dc.subject | intrinsic and corpus-based Information | en
dc.subject | HESML | ca
dc.subject | HESML | es
dc.subject | HESML | en
dc.subject | PosetHERep | ca
dc.subject | PosetHERep | es
dc.subject | PosetHERep | en
dc.subject | medidas semánticas bibliotecarias | es
dc.subject | mesures semàntiques bibliotecàries | ca
dc.subject | ontology-based semantic similarity | en
dc.subject | measures | en
dc.subject | medidas | es
dc.subject | mesures | ca
dc.subject | content models | en
dc.subject | modelos de contenido | es
dc.subject | models de contingut | ca
dc.subject | similarity | en
dc.subject | similitud | es
dc.subject | similitut | ca
dc.subject | ReproZip | ca
dc.subject | ReproZip | es
dc.subject | ReproZip | en
dc.subject | WNSimRep v1 dataset | ca
dc.subject | WNSimRep v1 dataset | es
dc.subject | WNSimRep v1 dataset | en
dc.subject | reproducible experiments on word | en
dc.subject | experimentos reproducibles con palabras | es
dc.subject | experiments reproduïbles amb paraules | ca
dc.subject | WordNet-based semantic similarity | en
dc.subject | WordNet-basado en similitud semántica | es
dc.subject | WordNet-basat en similitud semàntica | ca
dc.subject | información intrínseca basada en corpus | es
dc.subject | informació intrínseca basada en corpus | ca
dc.subject.lcsh | Ontologies (Information retrieval) | en
dc.title | HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset | -
dc.type | info:eu-repo/semantics/article | -
dc.subject.lemac | Ontologies (Informàtica) | ca
dc.subject.lcshes | Ontologías (Informática) | es
dc.identifier.doi | 10.1016/j.is.2017.02.002 | -
dc.gir.id | AR/0000005524 | -
dc.relation.projectID | info:eu-repo/grantAgreement/TIN2015-71785-R | -
dc.relation.projectID | info:eu-repo/grantAgreement/S2015/HUM3494 | -
dc.type.version | info:eu-repo/semantics/publishedVersion | -
Appears in collections: Articles científics
Articles

Files in this item:
File | Description | Size | Format
hesml.pdf | | 1.48 MB | Adobe PDF