Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/127711
Title: Machine learning techniques applied to the search for breast cancer biomarkers (original: Técnicas de machine learning aplicadas a la búsqueda de biomarcadores de cáncer de mama)
Author: Pérez Córdova, Javier
Director: Rius, Àngels  
Tutor: Iglesias Allones, Jose Luis
Abstract: Breast cancer is the most common cancer in the female population, with a prevalence of 16% among all female cancers according to data from the World Health Organization (1). Although it is mainly associated with the developed world, the highest mortality rates occur in developing countries (69% of deaths (1)). On this basis, this work studies several databases containing anthropometric values and values obtained from blood tests, such as the Breast Cancer Coimbra dataset (2). The CRISP-DM methodology (3) is applied throughout the data mining cycle, with an exhaustive study of the different variables and a thorough review of existing machine learning techniques, including those already applied to breast cancer screening. Decision trees, random forests and gradient boosting machines are then applied to find variables that can serve as targets in screening and early detection of breast cancer. Finally, a tool for clinical use is provided to support decision-making, based on the best models obtained for each algorithm, such as a random forest model with a ROC value of 79.4%. The aim is to improve doctors' adherence to the use of the knowledge extracted from the analysis and to encourage their confidence in the results.
Keywords: breast cancer; machine learning; data mining
Document type: info:eu-repo/semantics/masterThesis
Issue Date: Jan-2021
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Bachelor thesis, research projects, etc.

Files in This Item:
File: javipercorTFM0121memoria.pdf
Description: Master's thesis report (Memoria del TFM)
Size: 1.09 MB
Format: Adobe PDF