Use this identifier to cite or link to this item: http://hdl.handle.net/10609/150130
Title: DescribeML: A dataset description tool for machine learning
Authors: Giner Miguelez, Joan  
Gómez, Abel  
Cabot, Jordi  
Citation: Giner-Miguelez, J. [Joan], Gómez, A. [Abel] & Cabot, J. [Jordi]. (2024). DescribeML: A dataset description tool for machine learning. Science of Computer Programming, 231, 103030. doi: 10.1016/j.scico.2023.103030
Abstract: Datasets are essential for training and evaluating machine learning models. However, they are also the root cause of many undesirable model behaviors, such as biased predictions. To address this issue, the machine learning community is proposing as a best practice the adoption of common guidelines for describing datasets. However, these guidelines are based on natural language descriptions of the dataset, hampering the automatic computation and analysis of such descriptions. To overcome this situation, we present DescribeML, a language engineering tool to precisely describe machine learning datasets in terms of their composition, provenance, and social concerns in a structured format. The tool is implemented as a Visual Studio Code extension.
Keywords: datasets
machine learning
model-driven engineering
fairness
domain-specific languages
DOI: https://doi.org/10.1016/j.scico.2023.103030
Document type: info:eu-repo/semantics/article
Document version: info:eu-repo/semantics/publishedVersion
Publication date: 2 Jan 2024
Publication license: https://creativecommons.org/licenses/by-nc-nd/4.0/  
Appears in collections: Scientific articles
Articles

Files in this item:
File: DescribeML_A_dataset_description_tool_for_machine_learning.pdf
Size: 919.38 kB
Format: Adobe PDF

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.