WCSE 2016
ISBN: 978-981-11-0008-6 DOI: 10.18178/wcse.2016.06.125

Analysis of Automatic Clustering of Textual Information Using by Mountain Clustering Method

Kyaw Zaw Ye, Fedorov A. R., Shiryaev A. P., Gagarina L. G., Yanakova E. S.

Abstract— Methods for searching and categorizing information are becoming increasingly important as the volume of unstructured information on the Internet grows. One such method is cluster analysis, which is represented by a variety of algorithms. This article is devoted to the research and development of a heuristic method for effective automatic clustering of text information. The proposed solution uses a subtractive clustering algorithm to determine the number of clusters and then partitions the articles into clusters using an algorithm of the k-means family. A computational experiment on a collection of articles from the web resource Wikipedia.ru demonstrates that this method, using the k-means and k-medoids algorithms, is suitable for automatic clustering of textual information but requires further improvement.

Index Terms— clustering, cluster analysis, k-means, k-medoids, mountain clustering.
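
The two-stage idea described in the abstract (subtractive clustering to estimate the number of clusters, then a k-means-family algorithm to assign items) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the radius r_a = 0.5, the 1.5 radius factor, the simplified stopping ratio of 0.15, and the toy 2-D points standing in for document vectors are all assumptions chosen for illustration.

```python
import numpy as np

def subtractive_centers(X, r_a=0.5, stop_ratio=0.15, max_centers=50):
    """Pick cluster centers among the data points by subtractive clustering.

    Assumed, simplified variant: stop once the highest remaining potential
    drops below stop_ratio times the first peak.
    """
    # Pairwise squared Euclidean distances between all points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    alpha = 4.0 / r_a ** 2            # controls the neighbourhood of a point
    beta = 4.0 / (1.5 * r_a) ** 2     # controls how widely a peak is subtracted
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()
    chosen = []
    while len(chosen) < max_centers:
        idx = int(np.argmax(potential))
        peak = potential[idx]
        if peak < stop_ratio * first_peak:
            break
        chosen.append(idx)
        # Remove the influence of the new center from every point.
        potential = potential - peak * np.exp(-beta * d2[idx])
    return X[chosen]

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd's k-means started from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data: three tight groups of 2-D points (hypothetical document vectors).
rng = np.random.default_rng(0)
groups = [rng.normal(loc=c, scale=0.02, size=(30, 2))
          for c in ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0])]
X = np.vstack(groups)

seeds = subtractive_centers(X)      # the number of seeds is the estimated k
labels, centers = kmeans(X, seeds)
print("estimated number of clusters:", len(seeds))
```

In this sketch the subtractive stage supplies both the cluster count and the initial centers, so the k-means stage needs no separately chosen k. Real text data would first have to be turned into numeric vectors and scaled so that the radius is meaningful, a step the sketch leaves out.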

Kyaw Zaw Ye, Fedorov A. R., Shiryaev A. P., Gagarina L. G., Yanakova E. S.
National Research University of Electronic Technology, Russian Federation

Cite: Kyaw Zaw Ye, Fedorov A. R., Shiryaev A. P., Gagarina L. G., Yanakova E. S., "Analysis of Automatic Clustering of Textual Information Using by Mountain Clustering Method," Proceedings of 2016 6th International Workshop on Computer Science and Engineering, pp. 706-709, Tokyo, 17-19 June, 2016.