ISBN: 978-981-18-5852-9 DOI: 10.18178/wcse.2022.04.012
Hybrid Feature Selection Technique with an Enhanced TF-IDF and SVM-RFE in Sentiment Classification Based on Hotel Review
Abstract— This paper presents a hybrid feature selection technique and evaluates its sentiment classification performance on a hotel review dataset. The technique is proposed to address the inability of existing techniques to calculate feature importance accurately. An enhanced Term Frequency-Inverse Document Frequency (TF-IDF) is proposed that applies a variance threshold during feature reduction, so that non-significant features are discarded without removing significant ones. The enhanced TF-IDF is then hybridized with Support Vector Machine-Recursive Feature Elimination (SVM-RFE); the resulting technique is referred to as TF-IDF+SVM-RFE. TF-IDF+SVM-RFE measures feature importance and selects the significant features to be passed to the classifier. The Hotel Review dataset from Kaggle is used to evaluate the proposed technique, with classification performance measured by accuracy, precision, recall and F-measure. In the experiment, the proposed technique outperforms other related techniques, achieving 91.74%, 91.51%, 91.92% and 91.70% for accuracy, precision, recall and F-measure respectively. The proposed technique also reduces the number of features used in classification by 38.53%, from 17896 to 11000. This reduction is significant for making efficient use of computational resources while maintaining classification performance.
Index Terms— Sentiment Classification, Sentiment Analysis, Feature Selection, Computational Intelligence.
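The pipeline described in the abstract (TF-IDF weighting, a variance threshold, then SVM-RFE before classification) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses the stock `VarianceThreshold` as a stand-in for the enhanced TF-IDF step (whose details are not given here), and runs on a small hypothetical corpus rather than the Kaggle hotel review data. The threshold value, `N_KEEP`, and the `LinearSVC` settings are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import RFE, VarianceThreshold
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews (NOT the Kaggle dataset); 1 = positive, 0 = negative.
reviews = [
    "clean room and friendly staff",
    "great location and clean pool",
    "friendly staff and great breakfast",
    "lovely view and great service",
    "dirty room and rude staff",
    "noisy room and terrible breakfast",
    "rude staff and dirty pool",
    "terrible service and noisy location",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

N_KEEP = 6  # assumed number of features to retain after SVM-RFE

pipeline = Pipeline([
    # Step 1: weight each term by TF-IDF.
    ("tfidf", TfidfVectorizer()),
    # Step 2: drop terms whose TF-IDF weight barely varies across documents
    # (stand-in for the paper's variance-threshold enhancement; value assumed).
    ("variance", VarianceThreshold(threshold=1e-4)),
    # Step 3: SVM-RFE, i.e. repeatedly fit a linear SVM and eliminate the
    # feature with the smallest weight magnitude until N_KEEP remain.
    ("svm_rfe", RFE(LinearSVC(C=1.0, random_state=0),
                    n_features_to_select=N_KEEP, step=1)),
    # Step 4: classify using only the selected features.
    ("clf", LinearSVC(C=1.0, random_state=0)),
])
pipeline.fit(reviews, labels)

# Feature counts before and after selection, and the reduction rate,
# computed the same way as the 17896 -> 11000 (38.53%) figure in the abstract.
n_total = len(pipeline.named_steps["tfidf"].vocabulary_)
n_selected = int(pipeline.named_steps["svm_rfe"].n_features_)
reduction = 100.0 * (n_total - n_selected) / n_total
print(f"features: {n_total} -> {n_selected} ({reduction:.2f}% reduction)")
```

On real data, the variance threshold and the number of retained features would need to be tuned, for example by cross-validating accuracy, precision, recall and F-measure over a range of values.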
Suryanti Awang
Center of Excellence for Artificial Intelligence & Data Science, Universiti Malaysia Pahang, Malaysia; Faculty of Computing, Universiti Malaysia Pahang, Malaysia
Nur Syafiqah Mohd Nafis
Faculty of Computing, Universiti Malaysia Pahang, Malaysia
Cite: Suryanti Awang, Nur Syafiqah Mohd Nafis, "Hybrid Feature Selection Technique with an Enhanced TF-IDF and SVM-RFE in Sentiment Classification Based on Hotel Review," WCSE 2022 Spring Event: 2022 9th International Conference on Industrial Engineering and Applications, pp. 96-104, Sanya, China, April 15-18, 2022.