1 code implementation • 31 Aug 2023 • Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa
We use this dataset to evaluate the capabilities of multilingual masked language models (MLMs) and large language models (LLMs).
1 code implementation • ACL 2021 • Philippe Laban, Luke Dai, Lucas Bandarkar, Marti A. Hearst
The Shuffle Test is the most common task to evaluate whether NLP models can measure coherence in text.
1 code implementation • NAACL 2021 • Philippe Laban, Lucas Bandarkar, Marti A. Hearst
Recent progress in Natural Language Understanding (NLU) has seen the latest models surpass human performance on many standard tasks.