no code implementations • 1 Apr 2024 • Wilson Wu, John X. Morris, Lionel Levine
Do transformers "think ahead" during inference at a given position?
1 code implementation • 2 Feb 2024 • Zach Nussbaum, John X. Morris, Brandon Duderstadt, Andriy Mulyar
This technical report describes the training of nomic-embed-text-v1, the first fully reproducible, open-source, open-weights, open-data, 8192 context length English text embedding model that outperforms both OpenAI Ada-002 and OpenAI text-embedding-3-small on short and long-context tasks.
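A minimal usage sketch, assuming the model is published on the Hugging Face Hub as `nomic-ai/nomic-embed-text-v1` and loadable through sentence-transformers; the task-prefix convention shown (e.g. `search_document: `) follows the model card, which should be treated as the authoritative reference.

```python
# Sketch: embedding text with nomic-embed-text-v1 via sentence-transformers.
# Assumes the Hub ID "nomic-ai/nomic-embed-text-v1" and that inputs take a
# task-instruction prefix such as "search_document: " (check the model card).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

docs = [
    "search_document: Nomic Embed is a long-context text embedding model.",
    "search_document: It supports sequences up to 8192 tokens.",
]
embeddings = model.encode(docs)  # shape: (2, embedding_dim)
print(embeddings.shape)
```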
2 code implementations • 22 Nov 2023 • John X. Morris, Wenting Zhao, Justin T. Chiu, Vitaly Shmatikov, Alexander M. Rush
We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information about the preceding text.
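To make the object of study concrete, here is a short sketch of the signal that language model inversion operates on: the full next-token probability distribution at a position. GPT-2 stands in for the target model; the inversion step itself (training a model to map such vectors back to the prompt) is not shown.

```python
# Extract a next-token probability vector, the input to LM inversion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The patient's diagnosis was"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, seq_len, vocab_size)

# A single ~50k-dimensional probability vector: the claim is that vectors
# like this carry a surprising amount of information about the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
print(next_token_probs.shape)  # torch.Size([50257]) for GPT-2
```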
2 code implementations • 21 Oct 2023 • John X. Morris, Chandan Singh, Alexander M. Rush, Jianfeng Gao, Yuntian Deng
Prompting language models (LMs) is the main interface for applying them to new tasks.
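A minimal illustration of prompting as a task interface: the "program" is just a string template filled with the input. GPT-2 is used here only as a convenient stand-in for a stronger instruction-following model.

```python
# Zero-shot sentiment classification expressed purely as a prompt.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

template = "Review: {text}\nSentiment (positive or negative):"
prompt = template.format(text="The movie was a complete waste of time.")
print(generator(prompt, max_new_tokens=3)[0]["generated_text"])
```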
1 code implementation • 10 Oct 2023 • John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Alexander M. Rush
How much private information do text embeddings reveal about the original text?
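The released code for this work is the `vec2text` package. A minimal sketch of the inversion threat model follows; the entry-point names (`load_pretrained_corrector`, `invert_embeddings`) are taken from the project README and may drift between versions, so treat the exact API as an assumption and check the repository.

```python
# Sketch of embedding inversion with vec2text: a pretrained "corrector"
# iteratively refines hypothesis text until its embedding matches the target.
# API names are assumptions from the README; see github.com/jxmorris12/vec2text.
import torch
import vec2text

corrector = vec2text.load_pretrained_corrector("text-embedding-ada-002")

# target_embeddings would come from an embedding API the attacker observed;
# here it is a placeholder tensor with Ada-002's dimensionality (1536).
target_embeddings = torch.randn(1, 1536)

recovered = vec2text.invert_embeddings(
    embeddings=target_embeddings,
    corrector=corrector,
    num_steps=20,  # rounds of iterative refinement
)
print(recovered)  # best-guess reconstruction of the original text
```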
1 code implementation • 20 Oct 2022 • John X. Morris, Justin T. Chiu, Ramin Zabih, Alexander M. Rush
We propose an unsupervised deidentification method that masks words that leak personally identifying information.
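A toy sketch of the masking idea, not the paper's exact algorithm: greedily mask the word whose removal most reduces a reidentification model's confidence that the text matches a profile. Here `reid_prob` is a hypothetical stand-in for the learned text-profile matching model the paper uses.

```python
# Greedy masking-based deidentification (illustrative only).

def reid_prob(text: str, profile: str) -> float:
    """Hypothetical reidentification score: here, fraction of profile
    words that leak into the text (a real system uses a learned model)."""
    words = set(text.lower().split())
    leaked = [w for w in profile.lower().split() if w in words]
    return len(leaked) / max(len(profile.split()), 1)

def deidentify(text: str, profile: str, budget: int = 3) -> str:
    tokens = text.split()
    for _ in range(budget):
        scores = [
            (reid_prob(" ".join(tokens[:i] + ["[MASK]"] + tokens[i + 1:]), profile), i)
            for i, tok in enumerate(tokens) if tok != "[MASK]"
        ]
        if not scores:
            break
        _, best_i = min(scores)  # mask the word whose removal helps most
        tokens[best_i] = "[MASK]"
    return " ".join(tokens)

print(deidentify("Alice Smith teaches chemistry in Ithaca", "Alice Smith Ithaca"))
```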
2 code implementations • 4 Oct 2022 • Chandan Singh, John X. Morris, Jyoti Aneja, Alexander M. Rush, Jianfeng Gao
Large language models (LLMs) have displayed an impressive ability to harness natural language to perform complex tasks.
1 code implementation • 5 Oct 2020 • John X. Morris
In such attack methods, a valid adversarial example must both fool the model being attacked and be judged semantically or syntactically valid by a second model.
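A sketch of that two-model setup: a candidate counts as a valid adversarial example only if it flips the victim classifier and a second model, here a sentence encoder, judges it close enough to the original. The model names and the 0.9 threshold are illustrative assumptions.

```python
# Validity check for a candidate adversarial example using a second model.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

victim = pipeline("sentiment-analysis")            # model under attack
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # second "validity" model

def is_valid_adversarial(original: str, candidate: str, threshold: float = 0.9) -> bool:
    flipped = victim(original)[0]["label"] != victim(candidate)[0]["label"]
    sim = util.cos_sim(encoder.encode(original), encoder.encode(candidate)).item()
    return flipped and sim >= threshold
```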
2 code implementations • EMNLP (BlackboxNLP) 2020 • Jin Yong Yoo, John X. Morris, Eli Lifland, Yanjun Qi
We study the behavior of several black-box search algorithms used for generating adversarial examples for natural language processing (NLP) tasks.
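For context, the simplest member of the search-algorithm family benchmarked here is a greedy word-substitution search under a query budget. The sketch below is generic: `predict` and `synonyms` are hypothetical stand-ins for the black-box victim model and the word transformation; real attacks add semantic constraints and stronger search (beam, genetic, particle swarm).

```python
# Minimal greedy black-box word-swap search (illustrative).

def greedy_word_swap(text, predict, synonyms, max_queries=200):
    """Try synonym swaps left-to-right until the victim's label flips."""
    tokens = text.split()
    original_label = predict(text)
    queries = 1
    for i, word in enumerate(tokens):
        for candidate in synonyms(word):
            if queries >= max_queries:
                return None  # query budget exhausted
            trial = " ".join(tokens[:i] + [candidate] + tokens[i + 1:])
            queries += 1
            if predict(trial) != original_label:
                return trial  # adversarial example found
    return None  # search failed
```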
2 code implementations • EMNLP 2020 • John X. Morris, Eli Lifland, Jin Yong Yoo, Jake Grigsby, Di Jin, Yanjun Qi
TextAttack also includes data augmentation and adversarial training modules for using components of adversarial attacks to improve model accuracy and robustness.
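A short example of the augmentation module, which reuses attack transformations to generate training data. `EmbeddingAugmenter` swaps words for nearest neighbors in embedding space; the parameter names below follow the TextAttack docs but are worth checking against the current release.

```python
# Data augmentation with TextAttack's attack transformations.
from textattack.augmentation import EmbeddingAugmenter

augmenter = EmbeddingAugmenter(
    pct_words_to_swap=0.1,          # fraction of words to perturb
    transformations_per_example=4,  # augmented copies per input
)
print(augmenter.augment("TextAttack makes adversarial NLP research easier."))
```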
2 code implementations • Findings of the Association for Computational Linguistics 2020 • John X. Morris, Eli Lifland, Jack Lanchantin, Yangfeng Ji, Yanjun Qi
State-of-the-art attacks on NLP models lack a shared definition of what constitutes a successful attack.
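One way to make "successful attack" precise, in the spirit of this paper: success is defined relative to an explicit set of constraints on the perturbation, not just a flipped label. The constraint functions below are illustrative placeholders for, e.g., sentence-encoder similarity or a grammar checker.

```python
# A constraint-relative definition of attack success (illustrative).

def attack_succeeds(original, perturbed, predict, constraints):
    """An attack succeeds iff the label flips AND every constraint holds."""
    label_flipped = predict(original) != predict(perturbed)
    return label_flipped and all(c(original, perturbed) for c in constraints)

# Example placeholder constraints:
same_length = lambda a, b: len(a.split()) == len(b.split())
max_two_words_changed = lambda a, b: sum(x != y for x, y in zip(a.split(), b.split())) <= 2

# attack_succeeds(orig, adv, model_predict, [same_length, max_two_words_changed])
```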