no code implementations • 17 Sep 2023 • Thuat Nguyen, Chien Van Nguyen, Viet Dac Lai, Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen
However, the training datasets for these LLMs, especially the recent state-of-the-art models, are often not fully disclosed.
2 code implementations • 29 Jul 2023 • Viet Dac Lai, Chien Van Nguyen, Nghia Trung Ngo, Thuat Nguyen, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen
Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate experiments and future research on multilingual LLMs.
no code implementations • 12 Apr 2023 • Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, Thien Huu Nguyen
The answer to this question requires a thorough evaluation of ChatGPT over multiple tasks with diverse languages and large datasets (i.e., beyond reported anecdotes), which is still missing or limited in current research.
1 code implementation • NAACL (ACL) 2022 • Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen
FAMIE is designed to address a fundamental problem in existing active learning (AL) frameworks: annotators must wait a long time between annotation batches because model training and data selection at each AL iteration are time-consuming.