Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/150130
Title: DescribeML: A dataset description tool for machine learning
Authors: Giner Miguelez, Joan
Gómez, Abel  
Cabot, Jordi  
Citation: Giner-Miguelez, J. [Joan], Gómez, A. [Abel] & Cabot, J. [Jordi]. (2024). DescribeML: A dataset description tool for machine learning. Science of Computer Programming, 231, 103030. doi: 10.1016/j.scico.2023.103030
Abstract: Datasets are essential for training and evaluating machine learning models. However, they are also the root cause of many undesirable model behaviors, such as biased predictions. To address this issue, the machine learning community is proposing as a best practice the adoption of common guidelines for describing datasets. However, these guidelines are based on natural language descriptions of the dataset, hampering the automatic computation and analysis of such descriptions. To overcome this situation, we present DescribeML, a language engineering tool to precisely describe machine learning datasets in terms of their composition, provenance, and social concerns in a structured format. The tool is implemented as a Visual Studio Code extension.
Keywords: datasets
machine learning
model-driven engineering
fairness
domain-specific languages
DOI: https://doi.org/10.1016/j.scico.2023.103030
Document type: info:eu-repo/semantics/article
Document version: info:eu-repo/semantics/publishedVersion
Publication date: 2 January 2024
Publication license: https://creativecommons.org/licenses/by-nc-nd/4.0/
Appears in collections: Articles científics
Articles

Files in this item:
File: DescribeML_A_dataset_description_tool_for_machine_learning.pdf
Size: 919.38 kB
Format: Adobe PDF

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.