WCSE 2019 SPRING ISBN: 978-981-14-1455-8
DOI: 10.18178/wcse.2019.03.007

Impressive Approach for Documents Clustering Using Semantics Relations in Feature Extraction

Wai Wai Lwin

Abstract— The World Wide Web (WWW) is now so widespread that fast, high-quality web document clustering algorithms play an important role in navigating, summarizing, and organizing information effectively. In this area, dimensionality reduction and semantic relations are also highly influential steps in the data mining process. Document similarity is usually computed with the vector space model, which represents a document by the features (terms) it contains. In general, this model does not treat nouns such as the names of people, countries, and items as features; they are almost always ignored as irrelevant attributes. However, some of these terms are valuable in a specific domain. Moreover, traditional feature representation cannot reflect the semantic content of a document because of the synonymy and polysemy problems. Motivated by these issues, we propose a domain ontology that represents the semantic relations of domain-specific terms, together with semantically related words in the manner of a lexical database. It can improve feature extraction for domain-specific documents and reduce dimensionality. As a result, the similarity measure becomes more precise and the separation between clusters improves. In this paper, we test the proposed method on document clustering with the Particle Swarm Optimization (PSO) document clustering algorithm, which performs a globalized search over the entire solution space. The proposed method supports efficient PSO-based document clustering by using semantic relations in feature extraction.

Index Terms— Data Mining, Features Extraction, Semantic Relation, Ontology, PSO
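
The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy ontology, the two sample sentences, and the function names `extract_features` and `cosine_similarity` are all hypothetical, chosen only to show how mapping synonyms to a shared concept can shrink the feature space and raise the cosine similarity of two related documents in a vector space model.

```python
import math
from collections import Counter

# Hypothetical toy ontology: maps surface terms to a shared concept label.
# These entries are illustrative assumptions, not taken from the paper.
TOY_ONTOLOGY = {
    "car": "vehicle",
    "automobile": "vehicle",
    "physician": "doctor",
    "doctor": "doctor",
}

def extract_features(text, ontology=None):
    """Return term frequencies; if an ontology is given, map terms to concepts first."""
    tokens = text.lower().split()
    if ontology is not None:
        tokens = [ontology.get(tok, tok) for tok in tokens]
    return Counter(tokens)

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc1 = "the physician drove a car"
doc2 = "the doctor drove an automobile"

# Plain bag-of-words features: synonyms count as unrelated dimensions.
plain1, plain2 = extract_features(doc1), extract_features(doc2)
plain_sim = cosine_similarity(plain1, plain2)
plain_vocab = set(plain1) | set(plain2)

# Ontology-mapped features: synonyms collapse onto one concept dimension.
onto1 = extract_features(doc1, TOY_ONTOLOGY)
onto2 = extract_features(doc2, TOY_ONTOLOGY)
onto_sim = cosine_similarity(onto1, onto2)
onto_vocab = set(onto1) | set(onto2)

print(f"plain:    {len(plain_vocab)} features, similarity {plain_sim:.2f}")
print(f"ontology: {len(onto_vocab)} features, similarity {onto_sim:.2f}")
```

On this toy input the combined vocabulary drops from 8 to 6 features and the similarity rises from about 0.4 to about 0.8, because "physician"/"doctor" and "car"/"automobile" each collapse to one dimension. A real system would also need tokenization, stop-word handling, term weighting, and word-sense disambiguation for polysemous terms, none of which this sketch attempts.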

Wai Wai Lwin
University of Computer Studies, MYANMAR


Cite: Wai Wai Lwin, "Impressive Approach for Documents Clustering Using Semantics Relations in Feature Extraction," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering WCSE_2019_SPRING, pp. 35-41, Yangon, Myanmar, February 27-March 1, 2019.