Ruminating Word Representations with Random Noise Masking

1 Jan 2021 · Hwiyeol Jo, Byoung-Tak Zhang

We introduce a training method for better word representations and model performance, which we call GraVeR (Gradual Vector Rumination). The method gradually and iteratively adds random noise and bias to the word embeddings after a model has been trained, then re-trains the model from scratch, initialized with the noised word embeddings. During re-training, some of the noise is compensated for, while the rest can be exploited to learn better representations. As a result, the word representations are further fine-tuned and specialized for the task. On six text classification tasks, our method improves model performance by a large margin. When GraVeR is combined with other regularization techniques, it shows further improvements. Lastly, we investigate the usefulness of GraVeR for pretraining with the training data.
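
The loop described in the abstract (train, perturb the embeddings, re-train from scratch with the perturbed embeddings as initialization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `mask_rate` and `noise_scale` parameters, the Gaussian noise, and the toy `train_fn` are all assumptions made for illustration, and the bias term mentioned in the abstract is omitted because the abstract does not specify it.

```python
import numpy as np


def add_masked_noise(embeddings, rng, mask_rate=0.2, noise_scale=0.1):
    """Return a copy of `embeddings` with Gaussian noise added to a random
    subset of entries. Distribution and masking scheme are assumptions."""
    mask = rng.random(embeddings.shape) < mask_rate        # entries to perturb
    noise = rng.normal(0.0, noise_scale, size=embeddings.shape)
    return embeddings + mask * noise                       # input is not modified


def ruminate(train_fn, init_embeddings, rounds=3, seed=0,
             mask_rate=0.2, noise_scale=0.1):
    """Hypothetical outer loop: train once, then repeatedly noise the learned
    embeddings and re-train with them as the initialization.

    `train_fn(embeddings)` is assumed to build and train a fresh model and
    return `(trained_embeddings, validation_score)`.
    """
    rng = np.random.default_rng(seed)
    embeddings, score = train_fn(init_embeddings)          # initial training
    for _ in range(rounds):
        noised = add_masked_noise(embeddings, rng, mask_rate, noise_scale)
        embeddings, score = train_fn(noised)               # re-train from scratch
    return embeddings, score


# Toy stand-in for real training: move embeddings halfway toward a fixed
# target and report the negative mean squared error as the "score".
_target = np.ones((5, 4))


def toy_train(embeddings):
    trained = embeddings + 0.5 * (_target - embeddings)
    score = -float(np.mean((trained - _target) ** 2))
    return trained, score


final_embeddings, final_score = ruminate(toy_train, np.zeros((5, 4)), rounds=3)
```

In a real setup, `train_fn` would re-initialize every model parameter except the embedding table, so that only the noised word vectors carry over between rounds.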


