no code implementations • 27 Jan 2024 • Kuleen Sasse, Samuel Barham, Efsun Sarioglu Kayi, Edward W. Staley
While large language models (LLMs) are extremely capable at text generation, their outputs are still distinguishable from human-authored text.
no code implementations • 13 Jul 2023 • Samuel Barham, Orion Weller, Michelle Yuan, Kenton Murray, Mahsa Yarmohammadi, Zhengping Jiang, Siddharth Vashishtha, Alexander Martin, Anqi Liu, Aaron Steven White, Jordan Boyd-Graber, Benjamin Van Durme
To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials.
no code implementations • 29 Apr 2023 • James Mayfield, Eugene Yang, Dawn Lawrie, Samuel Barham, Orion Weller, Marc Mason, Suraj Nair, Scott Miller
By repeating this process, collections of arbitrary size can be created in the style of MS MARCO but using naturally occurring documents in any desired genre and domain of discourse.
no code implementations • 30 May 2019 • Samuel Barham, Soheil Feizi
SPGD imposes a directional regularization constraint on input perturbations by projecting them onto the directions to nearby word embeddings with highest cosine similarities.
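The projection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the "highest cosine similarity" is measured between the perturbation and each direction from the current embedding to a neighboring embedding, and that only the single best-aligned direction is kept. The function name `project_perturbation` and its arguments are hypothetical, chosen for illustration only.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def cosine(u, v):
    # Cosine similarity, defined as 0.0 when either vector is zero.
    nu, nv = norm(u), norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def project_perturbation(embedding, perturbation, neighbors):
    """Project `perturbation` onto the neighbor direction it aligns with best.

    Assumption (not stated in the source): alignment is the cosine
    similarity between the perturbation and (neighbor - embedding).
    """
    best_dir, best_cos = None, -math.inf
    for nb in neighbors:
        # Direction from the current embedding toward a nearby embedding.
        direction = [n - e for n, e in zip(nb, embedding)]
        c = cosine(perturbation, direction)
        if c > best_cos:
            best_dir, best_cos = direction, c
    if best_dir is None or norm(best_dir) == 0.0:
        # No usable direction: return a zero perturbation.
        return [0.0] * len(perturbation)
    # Standard vector projection of the perturbation onto best_dir.
    scale = dot(perturbation, best_dir) / dot(best_dir, best_dir)
    return [scale * d for d in best_dir]


projected = project_perturbation(
    embedding=[0.0, 0.0],
    perturbation=[1.0, 0.2],
    neighbors=[[2.0, 0.0], [0.0, 3.0]],
)
print(projected)
```

In this toy case the perturbation points mostly toward the first neighbor, so the component along the second axis is discarded. How the nearby embeddings are selected, and whether more than one direction is retained, are details this sketch leaves out.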