DOI: 10.18178/wcse.2024.06.016
Performance Metrics Analysis of Machine Learning Classification Models with GloVe Word Embedding for a School-based Email Data
Abstract— The primary objective of this research is to classify school-based email correspondence into two distinct categories: General Inquiry Type and Verification Type. Emails are labeled using a rule-based technique that incorporates word embedding together with dictionary-based and user-defined keywords. The novelty of this study lies in its dataset of 21,280 emails originating from the researcher's affiliated university. Various machine learning models were trained on GloVe word embeddings of the emails. The models were evaluated on the school-based email dataset using accuracy, precision, recall, and F1 score. The results showed that KNN-GloVe outperformed the other machine learning models, consistently achieving the highest scores on all metrics used.
Index Terms— Machine learning, word embedding, GloVe, email classification
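The pipeline summarized in the abstract (embed each email, classify it with k-nearest neighbors, then score the predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional vectors in TOY_VECTORS are invented stand-ins for real GloVe vectors, averaging word vectors is an assumed pooling choice, and the sample emails, the label strings, and the helper names embed, knn_predict, and binary_metrics are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical 2-dimensional vectors standing in for real GloVe embeddings.
TOY_VECTORS = {
    "enroll": (1.0, 0.1), "schedule": (0.9, 0.2), "tuition": (0.8, 0.0),
    "verify": (0.1, 1.0), "transcript": (0.2, 0.9), "diploma": (0.0, 0.8),
}


def embed(text):
    """Average the vectors of known words (assumed pooling); skip unknown words."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented training emails, labeled with the two categories named in the abstract.
TRAIN = [
    (embed("enroll schedule"), "general_inquiry"),
    (embed("tuition schedule"), "general_inquiry"),
    (embed("enroll tuition"), "general_inquiry"),
    (embed("verify transcript"), "verification"),
    (embed("verify diploma"), "verification"),
    (embed("transcript diploma"), "verification"),
]

# Invented test emails with their assumed true labels.
TEST = [
    ("please verify my diploma", "verification"),
    ("tuition enroll question", "general_inquiry"),
]

y_true = [label for _, label in TEST]
y_pred = [knn_predict(TRAIN, embed(text)) for text, _ in TEST]
print(binary_metrics(y_true, y_pred, positive="verification"))
```

With real data, the toy dictionary would be replaced by pretrained GloVe vectors and the hand-written helpers by a tested library; the sketch only makes the order of the steps concrete.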
Lorelyn F. Adrales
Technological Institute of the Philippines, PHILIPPINES
Notre Dame of Dadiangas University, PHILIPPINES
Ariel M. Sison
Emilio Aguinaldo College, PHILIPPINES
Cite: Lorelyn F. Adrales, Ariel M. Sison, "Performance Metrics Analysis of Machine Learning Classification Models with GloVe Word Embedding for a School-based Email Data," 2024 The 14th International Workshop on Computer Science and Engineering (WCSE 2024), pp. 103-109, Phuket Island, Thailand, June 19-21, 2024.