1 code implementation • 21 Oct 2022 • Shengyuan Hou, Jushi Kai, Haotian Xue, Bingyu Zhu, Bo Yuan, Longtao Huang, Xinbing Wang, Zhouhan Lin
Recent works have revealed that Transformers implicitly learn syntactic information in their lower layers from data, although this is highly dependent on the quality and scale of the training data.
1 code implementation • Findings (EMNLP) 2021 • Wei Wang, Boxin Wang, Ning Shi, Jinfeng Li, Bingyu Zhu, Xiangyu Liu, Rong Zhang
Deep learning models exhibit a preference for statistical fitting over logical reasoning.