Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/127826
Title: Predicción de la toxicidad en péptidos mediante técnicas Machine Learning [Prediction of peptide toxicity using Machine Learning techniques]
Author: Monserrat Gómez, Mariano
Others: Maceira, Marc  
Sanchez-Martinez, Melchor  
Abstract: In the development of new drugs, most of the molecules studied are discarded during clinical trials because of toxicity. As a consequence, computational methods have been developed to predict the activity of candidate molecules. In recent years, peptides have been proposed as possible drug candidates because of their high biological activity, specificity, low production cost and high penetration. This project aims to develop a Machine Learning (ML) model that predicts peptide toxicity, for possible use in the development of new drugs. Toxic and non-toxic peptides are collected from different databases and gathered into a new dataset. From the peptide sequences, pseudo-amino acid descriptors are generated and used to build predictive models of peptide toxicity. Further datasets are created with clustering and balancing techniques and used to train predictive models with three ML algorithms: SVM, RF and GBRT. The best models are selected and evaluated with an external dataset. The models generated by clustering and subsampling had higher quality (accuracy, precision, sensitivity, specificity, F1-score and AUC of the ROC curve) than those obtained with the initial dataset. Of the five best models evaluated with an external dataset (Subsampling SVM, Subsampling RF, DBSCAN+Subsampling SVM, DBSCAN+Subsampling GBRT and Linclust RF), three presented better quality indicators. We conclude that the best models for predicting peptide toxicity are Subsampling SVM, Subsampling RF and DBSCAN+Subsampling GBRT.
Keywords: peptide
toxicity
machine learning
Document type: info:eu-repo/semantics/masterThesis
Issue Date: 5-Jan-2021
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Trabajos finales de carrera, trabajos de investigación, etc.

Files in This Item:
File | Description | Size | Format
mmonserratgTFM0121memoria.pdf | Memoria del TFM (master's thesis report) | 3.47 MB | Adobe PDF