Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/82245
Title: Conocimiento en 1000Genome y GWAS
Author: Nou Castell, Ramon
Tutor: Andrio, Pau  
Others: Marco-Galindo, Maria-Jesús  
Abstract: Biological knowledge, or at least a large part of it, is spread across different databases. Thanks to advances in computing power, we can analyse all this data using data mining, statistical methods and machine learning techniques. In this work, we focus on two important databases that can be used to find relations between populations and phenotypes using SNPs (Single Nucleotide Polymorphisms) as features. We use information from 1000Genome, a database containing the sequenced genomes of more than 1000 humans, and from GWAS, another database that records the relations between SNPs and traits (e.g., asthma or cancer). Different ways of extracting information are presented, including machine learning. After that, performance analysis and optimization techniques are applied to both computation speed (parallelism) and I/O (data distribution). Finally, a comparative analysis of machine learning algorithms is presented.
Keywords: machine learning; genome; SNP
Document type: info:eu-repo/semantics/masterThesis
Issue Date: Jun-2018
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Final degree projects, research works, etc.

Files in This Item:
File: rnouTFM0618memoria.pdf
Description: Master's thesis report (Memoria del TFM)
Size: 1.95 MB
Format: Adobe PDF

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.