Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/73550
Title: Minería de datos sobre los factores de riesgo del cáncer de mama
Author: Carretero Palacios, Guillermo
Tutor: Rebrij, Romina  
Others: Universitat Oberta de Catalunya
Merino, David  
Abstract: In this study we list and classify the risk factors that can lead to breast cancer. These factors fall into two large groups: genetic risk factors and environmental risk factors. The aim of the study is to observe the interactions between and within these groups. To carry out the study we used a data mining technique, specifically text mining, which allows large volumes of data to be analyzed quickly and efficiently. Our database was PubMed, which provided the medical articles needed for the study. To analyze these data we used R together with the PubMed.mineR package, which is designed specifically for analyzing PubMed texts. Applying this methodology yielded several results. First, we obtained a list of four genes whose mutations carry a high probability of causing breast cancer; the two main genes are BRCA1 and BRCA2. For the environmental factors the results were more dispersed, as there were many possibilities, but the most influential factors were age and overexposure to or production of estrogens. Another component with great weight was the hereditary factor. In conclusion, some factors weigh heavily in the possible appearance of breast cancer, and genetic and environmental factors are closely related.
Keywords: breast cancer
risk factor
text mining
Document type: info:eu-repo/semantics/masterThesis
Issue Date: 29-Jan-2018
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Trabajos finales de carrera, trabajos de investigación, etc.

Files in This Item:
File: gcarreteropTFM0118memoria.pdf
Description: Master's thesis report (Memoria del TFM)
Size: 779,4 kB
Format: Adobe PDF