26 Apr 2022 • Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi
We achieve this by introducing a special token [null], whose prediction indicates the absence of a word.
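As an illustration only, the role of the [null] token can be sketched as a post-processing step: each masked slot either receives a predicted word or, when the prediction is [null], is dropped. The function name `fill_masks`, the `[MASK]` placeholder string, and the list-of-strings representation are assumptions made for this sketch, and the predictions are supplied by hand rather than produced by a model. This is not the authors' implementation.

```python
# Hypothetical sketch: names and data layout are assumptions, not from the paper.
NULL_TOKEN = "[null]"   # special token meaning "no word belongs here"
MASK_TOKEN = "[MASK]"   # assumed placeholder for a masked slot


def fill_masks(tokens, predictions):
    """Fill each masked slot with its prediction, dropping [null] predictions.

    tokens:      list of strings, some of which equal MASK_TOKEN
    predictions: one predicted token per masked slot, in order
    """
    n_masks = sum(1 for tok in tokens if tok == MASK_TOKEN)
    if n_masks != len(predictions):
        raise ValueError("need exactly one prediction per masked slot")

    preds = iter(predictions)
    output = []
    for tok in tokens:
        if tok != MASK_TOKEN:
            output.append(tok)      # ordinary token, keep as-is
            continue
        pred = next(preds)
        if pred != NULL_TOKEN:      # [null] means the slot stays empty
            output.append(pred)
    return output


# Two masked slots: the first is filled, the second is predicted as [null].
tokens = ["I", "[MASK]", "like", "[MASK]", "tea"]
predictions = ["really", "[null]"]
result = fill_masks(tokens, predictions)
print(result)
```

Under these assumptions, the number of masked slots sets an upper bound on how many words are inserted, and [null] predictions let the output be shorter than that bound.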