1 code implementation • NAACL 2022 • Ramit Sawhney, Ritesh Soun, Shrey Pandit, Megh Thakkar, Sarvagya Malaviya, Yuval Pinter
CIAug achieves state-of-the-art results over existing interpolative augmentation methods on 10 benchmark datasets across 4 languages in text classification and named-entity recognition tasks.
no code implementations • EMNLP (MRL) 2021 • Ramit Sawhney, Megh Thakkar, Shrey Pandit, Debdoot Mukherjee, Lucie Flek
Interpolation-based regularisation methods have proven to be effective for various tasks and modalities.
no code implementations • EMNLP (MRL) 2021 • Megh Thakkar, Vishwa Shah, Ramit Sawhney, Debdoot Mukherjee
Cross-lingual transfer learning has been explored for a variety of tasks.
1 code implementation • EMNLP 2021 • Ramit Sawhney, Megh Thakkar, Shivam Agarwal, Di Jin, Diyi Yang, Lucie Flek
Interpolation-based regularisation methods for data augmentation have proven to be effective for various tasks and modalities.
1 code implementation • ACL 2022 • Ramit Sawhney, Megh Thakkar, Shrey Pandit, Ritesh Soun, Di Jin, Diyi Yang, Lucie Flek
Interpolation-based regularisation methods such as Mixup, which generate virtual training samples, have proven to be effective for various tasks and modalities. We extend Mixup and propose DMix, an adaptive distance-aware interpolative Mixup that selects samples based on their diversity in the embedding space.
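To make the idea concrete, below is a minimal sketch of interpolation-based augmentation in the spirit of Mixup, with a distance-aware partner selection step loosely illustrating the diversity-based selection described for DMix. This is not the authors' implementation; the selection rule and the helper names (select_diverse_partner, mixup_interpolate) are assumptions for illustration only.

```python
# Hypothetical sketch, not the DMix reference code: pairs each sample with its
# least similar neighbour in the embedding space, then mixes hidden states and
# labels with a Beta-sampled coefficient as in standard Mixup.
import torch
import torch.nn.functional as F


def select_diverse_partner(embeddings: torch.Tensor) -> torch.Tensor:
    """Pick, for each sample, the most distant sample in the embedding space
    (one simple notion of 'diversity'; the paper's actual criterion may differ)."""
    normed = F.normalize(embeddings, dim=-1)   # (batch, dim)
    sim = normed @ normed.t()                  # cosine similarity matrix
    sim.fill_diagonal_(float("inf"))           # never pair a sample with itself
    return sim.argmin(dim=-1)                  # index of the least similar sample


def mixup_interpolate(hidden: torch.Tensor, labels: torch.Tensor, alpha: float = 0.2):
    """Interpolate each hidden state (and its soft label) with its selected partner."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    partner = select_diverse_partner(hidden)
    mixed_hidden = lam * hidden + (1.0 - lam) * hidden[partner]
    mixed_labels = lam * labels + (1.0 - lam) * labels[partner]
    return mixed_hidden, mixed_labels
```

In this sketch the interpolation is applied to hidden representations rather than raw inputs, which is one common way interpolative regularisers are adapted to text; the actual layer choice and sampling schedule in DMix may differ.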
2 code implementations • 12 Mar 2024 • Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste
We study the use of large language model-based agents for interacting with software via web browsers.
no code implementations • 2 Nov 2023 • Megh Thakkar, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar
Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training.
1 code implementation • 11 May 2023 • Han Cheol Moon, Shafiq Joty, Ruochen Zhao, Megh Thakkar, Xu Chi
Large-scale pre-trained language models have shown outstanding performance in a variety of NLP tasks.
1 code implementation • 16 Nov 2022 • Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing
Due to their huge number of parameters, fine-tuned pretrained language models (PLMs) are prone to overfitting in low-resource scenarios.
2 code implementations • ACL 2022 • Shankar Kantharaj, Rixie Tiffany Ko Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty
We also introduce several state-of-the-art neural baselines that use image captioning and data-to-text generation techniques to tackle two problem variations: one assumes the underlying data table of the chart is available, while the other must extract data directly from chart images.