Joint Word Segmentation and Stemming with Neural Sequence Labeling for Myanmar Language
Abstract— Word segmentation is a widely studied sequence labeling problem, commonly addressed with machine learning methods such as conditional random fields (CRF). Deep learning approaches have achieved state-of-the-art performance in word segmentation. Normally, segmentation is treated as a separate process from stemming. We propose a joint model that has stronger capabilities for Myanmar word segmentation and stemming. As far as we know, this is the first work on joint Myanmar word segmentation and stemming. In this paper, we evaluate neural network architectures that rely on two sources of information, syllable-level and character-level representations, using LSTM, CNN, GRU and CRF components. For comparison and analysis, we examine the importance of different network designs and of factors such as the last layer of the network and the choice of optimizer.
Index Terms— Myanmar word segmentation, stemming, joint model, neural networks.
Yadanar Oo, Khin Mar Soe
University of Computer Studies, Yangon, Myanmar
Cite: Yadanar Oo, Khin Mar Soe, "Joint Word Segmentation and Stemming with Neural Sequence Labeling for Myanmar Language," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering WCSE_2019_SPRING, pp. 95-100, Yangon, Myanmar, February 27-March 1, 2019.