Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/145789
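Title: Identificación de las crisis en el sistema Zújar de la subdirección de análisis de información e investigación del fraude de la AEAT [Identification of crises in the Zújar system of the AEAT's Subdirectorate for Information Analysis and Fraud Investigation]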
Author: Torre Madrid, Rubén de la
Tutor: Andrés Sanz, Humberto
Abstract: Machine learning project aimed at identifying crises in the Zújar system of the TAIIF department of the AEAT, using the activity records generated by the applications that consume its information. The project evaluates the main classification models in order to select the model or models that achieve the best metrics when classifying records as crisis or non-crisis moments. The F1 metric is the main evaluation criterion, since it combines precision and recall, the two metrics most relevant to the problem of identifying as many crises as possible. Precision measures the percentage of real positives among the records classified as positive, and recall measures the percentage of real positives that are identified. In addition, several techniques are applied to improve the poor results obtained in the first stages of the modeling phase. These techniques are intended to mitigate several identified problems, mainly the imbalance between positive and negative cases. The deliverables of the project are this report, a documented Python library of utilities for evaluating the different models used, and a user's guide for that library.
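As an illustration of the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 can be computed for a binary crisis/non-crisis classifier. It is not taken from the thesis library: the function name `precision_recall_f1` and the label lists are hypothetical, and the convention that 1 marks a crisis record is an assumption.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels where 1 = crisis (assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # crises correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed crises

    # Precision: share of predicted positives that are real positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of real positives that were identified.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical, imbalanced example: 3 crisis records out of 10.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Because the negative class dominates in an imbalanced dataset, accuracy alone would look high even for a model that misses most crises, which is one reason F1 is a reasonable headline metric for this kind of problem.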
Keywords: business intelligence; machine learning; classification models; decision trees
Document type: info:eu-repo/semantics/bachelorThesis
Issue Date: 10-Jun-2022
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Bachelor thesis, research projects, etc.

Files in This Item:
File: rde_la_torremTFG0622memoria.pdf
Description: Memoria del TFG (bachelor thesis report)
Size: 1.83 MB
Format: Adobe PDF