ISBN: 978-981-18-7950-0 DOI: 10.18178/wcse.2023.06.021
Comparative Analysis of Various SMS Spam Detection Methods using Machine Learning
Abstract—SMS (Short Message Service) is a popular text messaging service commonly used in telephone, internet, and mobile device systems. The service relies on standardized communication protocols that enable short text messages to be exchanged between mobile devices. The increase in SMS spam messages can be attributed to the higher limit of free SMS allowed by Internet Service Providers (ISPs). SMS spam detection relies heavily on the presence of known words, phrases, abbreviations, and idioms commonly used in spam messages. Previous studies have developed various datasets to train and test SMS spam detection models and have used different classification techniques to improve the accuracy and efficiency of these models. The present study explores several classification techniques for SMS spam detection, including Naïve Bayes, Support Vector Machines (SVM), Decision Trees, Random Forest, and Neural Networks. These techniques use different approaches to identify patterns and features in the messages that distinguish spam from legitimate messages. Among these algorithms, the Naïve Bayes classifier achieved the highest accuracy, 98.44%, with a Matthews correlation coefficient of 0.93 on the dataset.
Index Terms—SMS, Spam Detection, Machine Learning, Legitimate Messages
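As a rough illustration of the approach summarized in the abstract, the sketch below trains a word-count Naïve Bayes classifier and computes the two reported metrics, accuracy and the Matthews correlation coefficient (MCC), from confusion-matrix counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the class name `NaiveBayesSketch`, the whitespace tokenizer, the add-one smoothing, and the four toy training messages are all hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not given here.
    return text.lower().split()


class NaiveBayesSketch:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for c in self.classes:
            self.vocab |= set(self.word_counts[c])
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods of known words.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][word] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def accuracy(tp, tn, fp, fn):
    # Fraction of all messages that were classified correctly.
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient; defined as 0.0 here when the denominator is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, made-up training data (illustrative only).
train_messages = [
    "win a free prize now",
    "free cash claim now",
    "are we meeting for lunch",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesSketch().fit(train_messages, train_labels)

if __name__ == "__main__":
    print(model.predict("claim your free prize"))
    print(model.predict("see you at lunch"))
    print(accuracy(45, 45, 5, 5), mcc(45, 45, 5, 5))
```

MCC is worth reporting alongside accuracy because spam datasets are typically imbalanced: a classifier that labels every message as legitimate can still score a high accuracy, while its MCC stays near zero.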
Kartik Ahluwalia, Gururaj H L, Rashmi R, Hong Lin
Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, INDIA
University of Houston, Downtown, USA
Cite: Kartik Ahluwalia, Gururaj H L, Rashmi R, Hong Lin, "Comparative Analysis of Various SMS Spam Detection Methods using Machine Learning" Proceedings of 2023 the 13th International Workshop on Computer Science and Engineering (WCSE 2023), pp. 146-155, June 16-18, 2023.