Please use this identifier to cite or link to this item:
http://hdl.handle.net/10609/82245
Title: | Conocimiento en 1000Genome y GWAS (Knowledge in 1000Genome and GWAS) |
Author: | Nou Castell, Ramon |
Tutor: | Andrio, Pau |
Others: | Marco-Galindo, Maria-Jesús |
Abstract: | Biological knowledge, or at least a large part of it, is spread across different databases. Thanks to advances in computing power, we can analyse all this data using data mining, statistical methods and machine learning techniques. In this work, we focus on two important databases that can be used to find relations between populations and phenotypes using SNPs (Single Nucleotide Polymorphisms) as features. We use information from 1000Genome, a database containing the sequenced genomes of more than 1000 humans, and from GWAS, another database that contains the relations between SNPs and traits (e.g., asthma or cancer). Different ways of extracting information are presented, including machine learning. After that, performance analysis and optimization techniques are applied to both computation speed (parallelism) and I/O (data distribution). Finally, a comparative analysis of machine learning algorithms is presented. |
Keywords: | machine learning genome SNP |
Document type: | info:eu-repo/semantics/masterThesis |
Issue Date: | Jun-2018 |
Publication license: | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
Appears in Collections: | Final degree projects, research projects, etc. |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
rnouTFM0618memoria.pdf | Master's thesis (TFM) report | 1.95 MB | Adobe PDF |
Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.