Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/63587
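Title: Ús d'algorismes d'aprenentatge automàtic en entorns big data per a l'obtenció de models predictius de contaminació [Use of machine learning algorithms in big data environments to obtain predictive pollution models]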
Author: Bonet-Vilela, Fidel  
Tutor: Isern, David  
Others: Universitat Oberta de Catalunya
Ventura, Carles  
Abstract: This project applies machine learning algorithms in big data environments to obtain predictive models of air pollution. Several machine learning models were built from historical weather, traffic and air pollution datasets collected by sensors distributed throughout the territory. The models were created in a big data environment because the volume of data these sensors collect is now very large. First, Apache Hadoop clusters were deployed in two architectures: a pseudo-distributed one running in a virtual machine, and a distributed one on the Amazon Web Services platform. Next, Apache Hive was used to load the data into the HDFS distributed file system and preprocess it. Finally, Apache Mahout was used as the machine learning library.
Keywords: machine learning
big data
Apache Hadoop
Document type: info:eu-repo/semantics/bachelorThesis
Issue Date: 1-Jun-2017
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Bachelor thesis, research projects, etc.

Files in This Item:
File                              Description                      Size       Format
fbonetviTFG0617memòria.pdf        Bachelor's thesis report         12.26 MB   Adobe PDF
fbonetviTFG0617presentació.pdf    Bachelor's thesis presentation   17.62 MB   Adobe PDF