no code implementations • 3 Apr 2024 • Sehyun Choi
Motivated by this approach, we propose Cross-Architecture Transfer Learning (XATL), in which the weights of components shared between linear-cost-inference (LCI) and self-attention-based transformers, such as layernorms, MLPs, and input/output embeddings, are transferred directly to the new architecture from already pre-trained model parameters.
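The transfer itself amounts to a selective state-dict copy: shared components keep their pre-trained weights while the new architecture's attention-replacement layers start from scratch. Below is a minimal PyTorch sketch of that idea, not the authors' released code; the function name `transfer_shared_weights` and the `SHARED_KEYS` naming patterns are illustrative assumptions, and the sketch presumes the two architectures use matching parameter names and shapes for their shared components.

```python
import torch

# Hypothetical substring patterns marking the shared components;
# real checkpoints will use their own parameter-naming scheme.
SHARED_KEYS = ("norm", "mlp", "embed_tokens", "lm_head")

@torch.no_grad()
def transfer_shared_weights(pretrained: torch.nn.Module,
                            new_model: torch.nn.Module) -> int:
    """Copy every parameter tensor whose name marks it as a shared
    component (layernorm, MLP, input/output embedding) and whose
    shape matches, leaving the new architecture's replacement
    token-mixer weights at their random initialization."""
    src = pretrained.state_dict()
    dst = new_model.state_dict()  # references new_model's own tensors
    copied = 0
    for name, tensor in src.items():
        if (any(key in name for key in SHARED_KEYS)
                and name in dst and dst[name].shape == tensor.shape):
            dst[name].copy_(tensor)  # in-place: updates new_model directly
            copied += 1
    return copied  # number of parameter tensors transferred
```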
1 code implementation • 16 Feb 2024 • Zhaowei Wang, Wei Fan, Qing Zong, Hongming Zhang, Sehyun Choi, Tianqing Fang, Xin Liu, Yangqiu Song, Ginny Y. Wong, Simon See
Abstraction ability is crucial to human intelligence and can also benefit various tasks in NLP.
1 code implementation • 15 Nov 2023 • Zhaowei Wang, Haochen Shi, Weiqi Wang, Tianqing Fang, Hongming Zhang, Sehyun Choi, Xin Liu, Yangqiu Song
Cognitive research indicates that abstraction ability is essential to human intelligence, yet it remains under-explored in language models.
1 code implementation • 13 Oct 2023 • Sehyun Choi, Tianqing Fang, Zhaowei Wang, Yangqiu Song
Large Language Models (LLMs) have demonstrated remarkable human-level natural language generation capabilities.
1 code implementation • 20 Apr 2023 • Tianqing Fang, Quyet V. Do, Sehyun Choi, Weiqi Wang, Yangqiu Song
Populating Commonsense Knowledge Bases (CSKB) is an important yet hard task in NLP, as it must handle knowledge from external sources involving unseen events and entities.
2 code implementations • EMNLP 2021 • Tianqing Fang, Weiqi Wang, Sehyun Choi, Shibo Hao, Hongming Zhang, Yangqiu Song, Bin He
Experimental results show that generalizing commonsense reasoning to unseen assertions is inherently a hard task.