Word embedding re-examined: is the symmetrical factorization optimal?

25 Sep 2019  ·  Zhichao Han, Jia Li, Xu Li, Hong Cheng ·

As observed in previous works, many word embedding methods exhibit two interesting properties: (1) words having similar semantic meanings are embedded closely; (2) analogy structure exists in the embedding space, such that ''emph{Paris} is to \emph{France} as \emph{Berlin} is to \emph{Germany}''. We theoretically analyze the inner mechanism leading to these nice properties. Specifically, the embedding can be viewed as a linear transformation from the word-context co-occurrence space to the embedding space. We reveal how the relative distances between nodes change during this transforming process. Such linear transformation will result in these good properties. Based on the analysis, we also provide the answer to a question whether the symmetrical factorization (e.g., \texttt{word2vec}) is better than traditional SVD method. We propose a method to improve the embedding further. The experiments on real datasets verify our analysis.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here