Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/150130
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Giner Miguelez, Joan | - |
| dc.contributor.author | Gómez, Abel | - |
| dc.contributor.author | Cabot, Jordi | - |
| dc.date.accessioned | 2024-04-03T08:46:45Z | - |
| dc.date.available | 2024-04-03T08:46:45Z | - |
| dc.date.issued | 2024-01-02 | - |
| dc.identifier.citation | Giner-Miguelez, J. [Joan], Gómez, A. [Abel] & Cabot, J. [Jordi]. (2024). DescribeML: A dataset description tool for machine learning. Science of Computer Programming, 231, 103030. doi: 10.1016/j.scico.2023.103030 | - |
| dc.identifier.issn | 0167-6423 | - |
| dc.identifier.uri | http://hdl.handle.net/10609/150130 | - |
| dc.description.abstract | Datasets are essential for training and evaluating machine learning models. However, they are also the root cause of many undesirable model behaviors, such as biased predictions. To address this issue, the machine learning community is proposing as a best practice the adoption of common guidelines for describing datasets. However, these guidelines are based on natural language descriptions of the dataset, hampering the automatic computation and analysis of such descriptions. To overcome this situation, we present DescribeML, a language engineering tool to precisely describe machine learning datasets in terms of their composition, provenance, and social concerns in a structured format. The tool is implemented as a Visual Studio Code extension. | en |
| dc.format.mimetype | application/pdf | - |
| dc.language.iso | eng | - |
| dc.publisher | Elsevier BV | - |
| dc.relation.ispartof | Science of Computer Programming, 2024, 231(103030) | - |
| dc.relation.uri | https://doi.org/10.1016/j.scico.2023.103030 | - |
| dc.rights | CC BY-NC-ND | - |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | - |
| dc.subject | datasets | en |
| dc.subject | machine learning | en |
| dc.subject | model-driven engineering | en |
| dc.subject | fairness | en |
| dc.subject | domain-specific languages | en |
| dc.title | DescribeML: A dataset description tool for machine learning | en |
| dc.type | info:eu-repo/semantics/article | - |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | - |
| dc.identifier.doi | https://doi.org/10.1016/j.scico.2023.103030 | - |
| dc.type.version | info:eu-repo/semantics/publishedVersion | - |
Appears in Collections: Articles científics; Articles

Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| DescribeML_A_dataset_description_tool_for_machine_learning.pdf | | 919,38 kB | Adobe PDF |

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.