Integrate Words Internal Information to Improve Word Embeddings

WCSE 2019 SUMMER ISBN: 978-981-14-1684-2
DOI: 10.18178/wcse.2019.06.075

Chuanxiang Tang, Yun Tang

Abstract— We propose a method for improving word embeddings by fusing the hidden information within words, in contrast to traditional methods that train word embeddings directly on the surface morphological information of words. We call this hidden information the implied meanings of the morphemes of a word. Based on the average principle and two attention mechanisms, we propose six implied-meaning embedding models. Comparative experiments on two basic Natural Language Processing tasks show that our models capture semantic information better than the classical models represented by CBOW, Skip-Gram and GloVe. In addition, we explore the relationship between the importance of the synthetic implied meanings and that of the word itself.

Index Terms— average principle, attention mechanism, word embedding, fusion.

Chuanxiang Tang, Yun Tang
School of Software, University of Science and Technology of China, CHINA

Cite: Chuanxiang Tang, Yun Tang, "Integrate Words Internal Information to Improve Word Embeddings," Proceedings of 2019 the 9th International Workshop on Computer Science and Engineering, pp. 508-514, Hong Kong, 15-17 June, 2019.
