An Enhanced Agglomerative Fuzzy K-Means Clustering Method with MapReduce Implementation on Hadoop Platform
Abstract: In this paper, an enhanced agglomerative fuzzy k-means clustering algorithm with a MapReduce implementation is proposed. In this algorithm, an initial center selection method is introduced to improve the accuracy and increase the convergence speed of the agglomerative fuzzy k-means algorithm. Then, a MapReduce implementation based on Apache Hadoop is presented to increase scalability for large-scale datasets. Experiments were conducted on a synthetic dataset, the WINE dataset from the UCI Repository, and a randomly generated large dataset. The experimental results show that the proposed algorithm can identify the true number of clusters and produce accurate results with good scalability on large datasets.
2014 IEEE International Conference on Progress in Informatics and Computing