Accurate Algorithm for Unsupervised Learning in Large Data Sets

Unsupervised learning, or clustering, in large data sets is a challenging problem. Most clustering algorithms are neither efficient nor accurate on such data sets, so the development of algorithms capable of solving clustering problems in large data sets is very important. In this paper, we consider one such accurate algorithm and test it on large data sets. The algorithm is based on the nonsmooth optimization formulation of the clustering problem. In this formulation, we use the squared Euclidean norm to define the similarity measure and apply the difference of convex (DC) representation of the clustering function.

Keywords - Cluster Analysis; Data Mining; Algorithms.
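
The nonsmooth formulation mentioned in the abstract can be illustrated with a small sketch. It assumes the standard form of the clustering function under the squared Euclidean norm, f(x) = (1/m) Σ_i min_j ||x^j − a^i||², and one common way to split it into a difference of two convex functions, f = f1 − f2, which follows from the identity min_j d_j = Σ_j d_j − max_j Σ_{t≠j} d_t. The function names, the sample data, and the chosen centers are illustrative assumptions, not taken from the paper, and the sketch only evaluates the functions; it does not implement the algorithm itself.

```python
import random


def sq_dist(x, a):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ai) ** 2 for xi, ai in zip(x, a))


def cluster_function(centers, points):
    """Nonsmooth clustering objective: mean over points of the
    squared distance to the nearest center."""
    return sum(min(sq_dist(c, a) for c in centers) for a in points) / len(points)


def dc_components(centers, points):
    """Return (f1, f2), both convex in the centers, with f = f1 - f2.

    f1 averages the sum of squared distances to all centers.
    f2 averages the largest 'sum over all centers but one'.
    """
    f1 = 0.0
    f2 = 0.0
    for a in points:
        d = [sq_dist(c, a) for c in centers]
        total = sum(d)
        f1 += total
        f2 += max(total - dj for dj in d)
    m = len(points)
    return f1 / m, f2 / m


# Illustrative data: 200 random points in the unit square, 3 trial centers.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
centers = [(0.2, 0.2), (0.8, 0.3), (0.5, 0.9)]

f = cluster_function(centers, points)
f1, f2 = dc_components(centers, points)

# The two evaluations agree up to floating-point rounding.
print(abs(f - (f1 - f2)) < 1e-9)
```

Here f1 is a sum of convex quadratics and f2 is a maximum of such sums, so both are convex, while f itself is neither convex nor smooth because of the inner minimum.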