Abstract
Data clustering is one of the most important and fundamental tasks in machine learning. It aims to divide a set of objects into groups according to their similarities. Density Peaks Clustering (DPC) was recently introduced as a fast, non-iterative clustering method that does not require prior knowledge of the number of clusters. However, the method has several shortcomings: it is sensitive to its user-adjustable parameter, it does not account for the data distribution, and it selects inappropriate centers when facing complex clusters. To overcome these issues, this paper proposes a novel density-based peaks clustering method called GDPCS. By exploiting the properties of the mutual neighborhood graph and the shortest-path distance, the proposed method accounts for the data distribution, captures the shape of clusters better, and reduces the connectivity between clusters. To demonstrate the effectiveness and superiority of the proposed method, many experiments were performed on both real-world and synthetic datasets. The results show that the proposed method achieves acceptable results on imbalanced and complex-shaped clusters and can detect more appropriate centers.
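The two ingredients named in the abstract, a mutual neighborhood graph and shortest-path distance, can be illustrated with a minimal sketch. This is not the GDPCS algorithm itself: the function names, the choice of k, the Euclidean edge weights, and the toy points are all assumptions made for illustration. The sketch only shows how a graph-based distance can keep two well-separated groups apart where a plain Euclidean distance would still relate them.

```python
import heapq
import math


def mutual_knn_graph(points, k):
    """Build a mutual k-nearest-neighbor graph (hypothetical helper).

    An edge i-j exists only when j is among the k nearest neighbors of i
    AND i is among the k nearest neighbors of j. Edge weights are
    Euclidean distances (an assumption of this sketch).
    """
    n = len(points)
    knn = []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        knn.append(set(order[:k]))
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in knn[i]:
            if i in knn[j]:
                graph[i][j] = math.dist(points[i], points[j])
    return graph


def shortest_path_distances(graph, source):
    """Dijkstra's algorithm; unreachable nodes keep distance infinity."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy data (illustrative only): two groups of three points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
graph = mutual_knn_graph(points, k=2)
dist_from_0 = shortest_path_distances(graph, 0)
```

In this toy setting the mutual-neighbor condition leaves no edge between the two groups, so the graph distance from point 0 to any point in the second group is infinite, while distances inside the first group follow the graph edges. How GDPCS actually combines these quantities with density estimation and center selection is not specified in the abstract and is not modeled here.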