DOI: 10.18178/wcse.2025.06.009
Risk Statement Classifier Using BERT and DistilBERT
Abstract— Risk management plays a crucial role in organizational decision-making, particularly in academic institutions, where risks ranging from financial limitations to infrastructure challenges can significantly affect operations. Onsite risk evaluation typically depends on manually sorting risk statements into categories, a process that can be labor-intensive, biased, and inconsistent. As universities move toward more data-driven approaches, the need for automated risk classification systems is growing. Recent advances in natural language processing (NLP) and machine learning (ML) have opened new avenues for automating tasks such as risk categorization. Pretrained language models such as BERT (Bidirectional Encoder Representations from Transformers) and its smaller, distilled counterpart, DistilBERT, have shown strong results in text classification. The deep contextual embeddings these models produce capture the meaning of words in context, which makes them well suited to categorizing risk statements. This research aims to develop and evaluate machine learning models based on BERT and DistilBERT for the automatic classification of risk statements, and to assess how accurately each model assigns risk statements to specific categories. The evaluation will measure accuracy, recall, and F1-score and compare the results of the two models.
Index Terms— Text Classification, Machine Learning, Natural Language Processing, Risk Management
Carlo G. Inovero, Kurt Andrei Carreon
Polytechnic University of the Philippines, PHILIPPINES
Cite: Carlo G. Inovero, Kurt Andrei Carreon, "Risk Statement Classifier Using BERT and DistilBERT", 2025 the 15th International Workshop on Computer Science and Engineering (WCSE 2025), pp. 51-57, Jeju Island, South Korea, June 28-30, 2025.
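The evaluation named in the abstract (accuracy, recall, and F1-score) can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function name `evaluate`, the macro averaging over categories, and the toy labels "financial" and "infrastructure" are all assumptions made for illustration; the abstract does not state how the per-category scores are averaged.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy, macro-averaged recall, and macro-averaged F1.

    Hypothetical helper: macro averaging is an assumption, not something
    stated in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this category
        else:
            fp[pred] += 1    # predicted category was wrong
            fn[truth] += 1   # true category was missed

    recalls, f1s = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_recall": sum(recalls) / len(labels),
        "macro_f1": sum(f1s) / len(labels),
    }


# Toy labels, invented for illustration only (not data from the paper).
y_true = ["financial", "financial", "infrastructure", "infrastructure"]
y_pred = ["financial", "infrastructure", "infrastructure", "infrastructure"]
scores = evaluate(y_true, y_pred)
```

Running the same function on the predictions of each model over a shared test set would give one way to compare BERT and DistilBERT on these metrics.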
