Learning Better Word Embedding by Asymmetric Low-Rank Projection of Knowledge Graph

19 May 2015  ·  Fei Tian, Bin Gao, Enhong Chen, Tie-Yan Liu ·

Word embedding, which refers to low-dimensional dense vector representations of natural words, has demonstrated its power in many natural language processing tasks. However, it may suffer from the inaccurate and incomplete information contained in the free text corpus as training data. To tackle this challenge, there have been quite a few works that leverage knowledge graphs as an additional information source to improve the quality of word embedding. Although these works have achieved certain success, they have neglected some important facts about knowledge graphs: (i) many relationships in knowledge graphs are \emph{many-to-one}, \emph{one-to-many} or even \emph{many-to-many}, rather than simply \emph{one-to-one}; (ii) most head entities and tail entities in knowledge graphs come from very different semantic spaces. To address these issues, in this paper, we propose a new algorithm named ProjectNet. ProjecNet models the relationships between head and tail entities after transforming them with different low-rank projection matrices. The low-rank projection can allow non \emph{one-to-one} relationships between entities, while different projection matrices for head and tail entities allow them to originate in different semantic spaces. The experimental results demonstrate that ProjectNet yields more accurate word embedding than previous works, thus leads to clear improvements in various natural language processing tasks.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here