no code implementations • 19 Feb 2024 • Xinbo Wu, Lav R. Varshney
Although large language models (LLMs) have demonstrated remarkable capability in solving various natural language tasks, their ability to follow human instructions remains a concern.
no code implementations • 9 Oct 2023 • Xinbo Wu, Lav R. Varshney
Focusing on the training process, we establish a meta-learning view of the Transformer architecture when trained for the causal language modeling task, by explicating an inner optimization process within the Transformer.