The key idea is to utilize word sememes to capture the exact meaning of a word within a specific context.
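As a toy illustration of this idea (our own sketch, not the model proposed in that work), the snippet below disambiguates a word by matching its context words against hypothetical sememe annotations for each sense; the annotations and the overlap scoring are assumptions made purely for illustration.

```python
# Toy sketch (assumption, not the paper's model): pick a sense of a word by
# matching context words against hypothetical sememe annotations per sense.
sememes = {
    # hypothetical HowNet-style sememe sets for two senses of "apple"
    "apple#fruit":   {"fruit", "food", "plant"},
    "apple#company": {"company", "computer", "brand"},
}

def pick_sense(context_words):
    # Score each sense by how many of its sememes overlap with the context.
    scores = {sense: len(s & set(context_words)) for sense, s in sememes.items()}
    return max(scores, key=scores.get)

print(pick_sense(["she", "ate", "a", "fruit", "salad"]))            # -> apple#fruit
print(pick_sense(["the", "company", "released", "a", "computer"]))  # -> apple#company
```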
The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words.
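For concreteness, the following sketch trains a Skip-gram model on a toy corpus using the gensim library (an assumed toolkit, not one prescribed by the paper) and queries the nearest neighbors of a word.

```python
# Minimal Skip-gram sketch using gensim (assumed toolkit).
from gensim.models import Word2Vec

# Toy corpus: each document is a list of tokens.
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["she", "sat", "on", "the", "river", "bank"],
    ["the", "bank", "raised", "interest", "rates"],
]

# sg=1 selects the Skip-gram architecture (sg=0 would be CBOW).
model = Word2Vec(
    sentences=corpus,
    vector_size=100,   # dimensionality of the word vectors
    window=5,          # context window size
    min_count=1,       # keep all words in this tiny example
    sg=1,              # Skip-gram
    epochs=50,
)

# Words occurring in similar contexts end up with similar vectors.
print(model.wv.most_similar("bank", topn=3))
```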
This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings.
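The sketch below illustrates the general recipe in simplified form (it is not the paper's exact algorithm): synonym pairs from a dictionary define a graph, edges are weighted by embedding similarity, and graph clustering yields candidate synsets. The synonym pairs and embeddings shown are hypothetical placeholders.

```python
# Illustrative sketch (not the paper's exact algorithm): build a synonymy graph,
# weight edges by embedding cosine similarity, and cluster it into synsets.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical inputs: synonym pairs from a dictionary and precomputed embeddings.
synonym_pairs = [("car", "automobile"), ("car", "machine"), ("auto", "automobile")]
embeddings = {w: np.random.rand(50) for w in {"car", "automobile", "machine", "auto"}}

G = nx.Graph()
for a, b in synonym_pairs:
    G.add_edge(a, b, weight=cosine(embeddings[a], embeddings[b]))

# Each detected community is treated as an induced synset.
for synset in greedy_modularity_communities(G, weight="weight"):
    print(sorted(synset))
```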
Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses.
An established method for WSI uses a language model to predict probable substitutes for target words and induces senses by clustering the resulting substitute vectors.
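A simplified version of this pipeline is sketched below, assuming a masked language model from HuggingFace transformers and agglomerative clustering from scikit-learn (both tooling assumptions; the published method differs in its details). Each usage of the target word is represented by the model's top predicted substitutes, and the usages are then clustered into senses.

```python
# Sketch of substitute-based WSI (assumed tooling: transformers + scikit-learn).
from transformers import pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import AgglomerativeClustering

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical usages of the ambiguous target word "bank".
usages = [
    "He deposited cash at the [MASK] yesterday.",
    "They walked along the [MASK] of the river.",
    "The [MASK] approved her mortgage application.",
]

# Represent each usage by its top predicted substitutes.
substitute_docs = []
for sent in usages:
    preds = fill_mask(sent, top_k=20)
    substitute_docs.append(" ".join(p["token_str"].strip() for p in preds))

# Cluster bag-of-substitutes vectors; each cluster is an induced sense.
X = CountVectorizer().fit_transform(substitute_docs)
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X.toarray())
print(labels)
```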
Evaluating these methods is also problematic, as rigorous quantitative evaluation in this space is limited, especially in comparison with single-sense embeddings.
Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring words.
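The toy sketch below illustrates a generative story consistent with these two observations (it is not the full AutoSense model): each (target, neighbor) pair draws a sense, the sense draws a topic, and the topic generates the neighboring word. All distributions here are randomly initialized purely for illustration.

```python
# Toy generative sketch consistent with the two observations above
# (not the full AutoSense model).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["river", "water", "loan", "money", "deposit", "shore"]

n_senses, n_topics = 2, 3
sense_topic = rng.dirichlet(np.ones(n_topics), size=n_senses)   # P(topic | sense)
topic_word = rng.dirichlet(np.ones(len(vocab)), size=n_topics)  # P(word | topic)
sense_prior = rng.dirichlet(np.ones(n_senses))                  # P(sense)

def generate_pair(target="bank"):
    s = rng.choice(n_senses, p=sense_prior)      # draw a sense for the pair
    z = rng.choice(n_topics, p=sense_topic[s])   # draw a topic from the sense
    w = rng.choice(vocab, p=topic_word[z])       # generate the neighboring word
    return (target, w), s

for _ in range(5):
    pair, sense = generate_pair()
    print(pair, "sense:", sense)
```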
This paper reports on our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018).