Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/133102
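Title: Desarrollo de un sistema de Machine Learning para obtener modelos de unión a factores de transcripción en datos ChIP-seq [Development of a Machine Learning system to obtain transcription factor binding models from ChIP-seq data]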
Author: Álvarez González, Sara
Tutor: Erill, Ivan  
Others: Maceira, Marc  
Abstract: The aim of this work is to obtain a predictive model capable of identifying the regions where a transcription factor (TF) binds to DNA. The data are drawn from the results of ChIP-seq, a technique that identifies the sequences to which TFs are bound. Pinpointing the precise sequence to which a TF attaches is difficult under certain molecular conditions. Machine Learning (ML), a branch of Artificial Intelligence focused on developing predictive models, is considered a useful analytical tool for this type of problem: its techniques can extract nonlinear patterns from large sets of examples. In addition, previous work has developed mathematical descriptors that convert the primary DNA sequence into numerical data matrices, greatly facilitating the use of ML algorithms. In this project we present a set of models trained to predict the binding regions of the TF GcrA in the bacterial species Brevundimonas subvibrioides. The results show high performance in predicting these regions, thanks to the use of both structural and DNA composition descriptors.
Keywords: protein binding; machine learning; transcription factor
Document type: info:eu-repo/semantics/masterThesis
Issue Date: Jun-2021
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Trabajos finales de carrera, trabajos de investigación, etc.

Files in This Item:
File: salvarezgonzTFM0621memoria.pdf
Description: Memoria del TFM (master's thesis report)
Size: 1.8 MB
Format: Adobe PDF