Differential Evolution for Large-Scale Clustering

WCSE 2019 SPRING ISBN: 978-981-14-1455-8
DOI: 10.18178/wcse.2019.03.010

Pyae Pyae Win Cho, Thi Thi Soe Nyunt, Thet Thet Aung

Abstract— Clustering is the task of organizing data instances into groups based on their similarity. It plays an essential role in knowledge discovery and data mining, serving as anything from a preprocessing step to the final goal of the task. Clustering methods based on evolutionary algorithms (EAs) have been developed to improve the effectiveness and accuracy of clustering. The huge amount of data generated by technological progress has made data clustering a challenging task and has drawn attention to EA-based approaches. Differential Evolution (DE), an instance of EAs, has been used to search for the best solution to clustering problems, and it has been successful at producing more compact clusters than traditional clustering techniques. This paper presents a parallel differential evolution algorithm on the Spark framework to facilitate clustering of large amounts of data. Experiments were conducted on several frequently used UCI machine learning datasets. The results show that the proposed approach is effective and comparable to existing algorithms.

Index Terms— Differential Evolution, Clustering, Apache Spark.

Pyae Pyae Win Cho, Thi Thi Soe Nyunt, Thet Thet Aung
University of Computer Studies, MYANMAR

Cite: Pyae Pyae Win Cho, Thi Thi Soe Nyunt, Thet Thet Aung, "Differential Evolution for Large-Scale Clustering," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering WCSE_2019_SPRING, pp. 58-62, Yangon, Myanmar, February 27-March 1, 2019.
