Please use this identifier to cite or link to this item: http://hdl.handle.net/10609/97627
Title: Fuzzy C-means and clustering algorithms: a comparative study
Author: García Domingo, Victor
Tutor: Nuñez Do Rio, Joan Manuel
Others: Ventura, Carles  
Abstract: Clustering is a technique that groups the observations in a dataset based on their distance to the cluster centres. One of the first clustering algorithms was K-Means (KM), which is especially accurate at recognising well-separated clusters. Fuzzy C-Means (FCM) was later formulated to improve on the accuracy of KM for datasets containing overlapping clusters. Since then, several derivatives of FCM have been developed to improve on it: Gustafson-Kessel Fuzzy C-Means (GKFCM) performs better for non-spherical clusters, Fuzzy C-Means++ (FCM++) and Suppressed-Fuzzy C-Means (S-FCM) improve FCM's efficiency, and Possibilistic C-Means (PCM) is more accurate for datasets with noise and outliers. In this project, I have compared KM, FCM, GKFCM, FCM++, S-FCM and PCM to check how each evolution has improved on its predecessor. The comparison is centred on FCM. I have evaluated the algorithms on criteria such as computational efficiency, performance and accuracy. I have found that, among all the algorithms, FCM has the best performance for datasets with overlapping clusters, even though S-FCM improves on its computational efficiency. KM is the most efficient algorithm, and GKFCM performs well with non-spherical clusters, although it is less accurate. Finally, PCM has not shown any advantage over FCM. This project is a starting point for future investigations of the conditions under which each algorithm works best. Most of the datasets used here are synthetic and have near-ideal characteristics. Real-world datasets, however, are expected to have more complex structures, for which the choice of algorithm requires a more thorough investigation.
Keywords: clustering; Fuzzy C-Means; algorithms
Document type: info:eu-repo/semantics/bachelorThesis
Issue Date: Jun-2019
Publication license: http://creativecommons.org/licenses/by-nc-nd/3.0/es/  
Appears in Collections: Bachelor thesis, research projects, etc.
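
The abstract contrasts the hard assignments of K-Means with the graded memberships of Fuzzy C-Means. The following is a minimal sketch of the standard FCM iteration, written for illustration only; it is not taken from the thesis. The function name `fuzzy_c_means`, its parameters (`m`, `max_iter`, `tol`, `seed`) and the toy data are all assumptions made for this example, and only the Python standard library is used.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Illustrative standard FCM; returns (centres, memberships).

    Hypothetical helper, not the thesis implementation. `m` > 1 is the
    fuzzifier; memberships[i][k] is the degree to which point i belongs
    to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centres = []
    for _ in range(max_iter):
        # Centre update: mean of the points weighted by u_ik ** m.
        centres = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centres.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [math.dist(points[i], centres[k]) for k in range(c)]
            zeros = [k for k in range(c) if dist[k] == 0.0]
            if zeros:
                # The point coincides with one or more centres.
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(c)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centres, u


# Toy data (made up): two tight groups plus one point between them.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
        (5.0, 5.0), (5.2, 4.9), (4.9, 5.1),
        (2.5, 2.5)]
centres, memberships = fuzzy_c_means(data, c=2)

# A K-Means-style hard label is the cluster with the largest membership.
hard_labels = [row.index(max(row)) for row in memberships]
```

In this sketch the in-between point receives a split membership rather than being forced into one cluster, which is the behaviour the abstract credits for FCM's advantage on overlapping clusters. The variants named in the abstract (GKFCM, FCM++, S-FCM, PCM) change the distance measure, the initialisation or the membership rule and are not covered here.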

Files in This Item:
File | Description | Size | Format
vgarciadomiTFG0619video.mp4 | Video of TFG | 49.64 MB | MP4
vgarciadomiTFG0619memory.pdf | Memory of TFG | 1.56 MB | Adobe PDF
vgarciadomiTFG0619presentation.pdf | Presentation of TFG | 713.65 kB | Adobe PDF