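Title: Análisis de la Encuesta de Salud Nacional y Examen de Nutrición de Estados Unidos (NHANES) usando machine learning [Analysis of the United States National Health and Nutrition Examination Survey (NHANES) using machine learning]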
Author: Crespo Estévez, María José
Director: Subirats Maté, Laia
Tutor: Casas Roma, Jordi
Keywords: medicine; machine learning
Issue Date: Jun-2019
Publisher: Universitat Oberta de Catalunya (UOC)
Abstract: This work uses the Kaggle dataset National Health and Nutrition Examination Survey. The main purpose is to design and implement unsupervised models that identify patterns, show how the data tend to group, and reveal whether there are comorbidities among the diseases. Predictive models are also designed to detect whether a patient suffers from any of the diseases recorded in the dataset. For the clustering models, the parameter n_neighbors is chosen with the elbow method; for the predictive models, the parameters are tuned first with RandomizedSearchCV and then with GridSearchCV. One k-Means clustering model is built for the whole dataset and another for the diseases in the medications file. The first shows that age and the variables related to dental health are the most important in determining the clusters; the second yields possible comorbidities among diseases. The predictive models use the following sklearn algorithms: Support Vector Classification, Gradient Boosting Classifier, AdaBoost Classifier, Random Forest Classifier, Naive Bayes, Logistic Regression and k-NN. The best model is obtained with AdaBoost, with an accuracy of 76.33%, although Naive Bayes offers a good TPR of 62.69% and the lowest number of false negatives among all models.
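The abstract ranks models by two different metrics: accuracy (best for AdaBoost) and the true positive rate, TPR (where Naive Bayes produces the fewest false negatives). The following is a minimal sketch of how the two can disagree. It is not the thesis code: the function names and the toy labels are hypothetical, and only binary labels are assumed (1 = patient has the disease, 0 = does not).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # sick patient correctly flagged
        elif truth != positive and pred == positive:
            fp += 1  # healthy patient wrongly flagged
        elif truth != positive and pred != positive:
            tn += 1  # healthy patient correctly cleared
        else:
            fn += 1  # sick patient missed (false negative)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def true_positive_rate(y_true, y_pred):
    """TPR (recall): fraction of actual positives that were detected."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # cautious: rarely predicts disease
model_b = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]  # eager: predicts disease more often

for name, y_pred in (("model_a", model_a), ("model_b", model_b)):
    fn = confusion_counts(y_true, y_pred)[3]
    print(name, accuracy(y_true, y_pred), true_positive_rate(y_true, y_pred), fn)
```

In this toy case the cautious model has the higher accuracy while the eager model has the higher TPR and fewer false negatives, which is the kind of trade-off the abstract describes. In scikit-learn the same quantities are available as `accuracy_score` and `recall_score` in `sklearn.metrics`.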
Language: Spanish
Appears in Collections: Bachelor thesis, research projects, etc.

Files in This Item:
File | Description | Size | Format
marcreestTFM0619presentacion.pptx | Presentation in pptx | 1.57 MB | Microsoft Powerpoint XML
marcreestTFM0619memoria.pdf | TFM report | 1.06 MB | Adobe PDF
marcreestTFM0619presentación.pdf | TFM presentation | 1.97 MB | Adobe PDF

This item is licensed under a Creative Commons License.