no code implementations • 30 Apr 2024 • Yasumasa Onoe, Sunayana Rane, Zachary Berger, Yonatan Bitton, Jaemin Cho, Roopal Garg, Alexander Ku, Zarana Parekh, Jordi Pont-Tuset, Garrett Tanzer, Su Wang, Jason Baldridge
Vision-language datasets are vital for both text-to-image (T2I) and image-to-text (I2T) research.
no code implementations • CVPR 2023 • Aishwarya Kamath, Peter Anderson, Su Wang, Jing Yu Koh, Alexander Ku, Austin Waters, Yinfei Yang, Jason Baldridge, Zarana Parekh
Recent studies in Vision-and-Language Navigation (VLN) train RL agents to execute natural-language navigation instructions in photorealistic environments, as a step towards robots that can follow human instructions.
Ranked #1 on Vision and Language Navigation on RxR (using extra training data)
2 code implementations • 22 Jun 2022 • Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, Zirui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu
We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.
Ranked #1 on Text-to-Image Generation on LAION COCO
4 code implementations • 11 Feb 2021 • Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yun-Hsuan Sung, Zhen Li, Tom Duerig
In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without the expensive filtering or post-processing steps used in the Conceptual Captions dataset.
Ranked #1 on Image Classification on VTAB-1k (using extra training data)
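The entry above describes contrastive pretraining of a dual image/text encoder on noisy alt-text pairs. As an illustration only (not the paper's actual implementation), here is a minimal NumPy sketch of the symmetric contrastive (InfoNCE-style) loss such dual-encoder models typically optimize, assuming the encoders have already produced per-example embeddings:

```python
import numpy as np

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Hypothetical sketch of a dual-encoder objective: matching
    image/text pairs sit on the diagonal of the similarity matrix,
    and each direction (image-to-text, text-to-image) is scored
    with a softmax cross-entropy. Real models learn the temperature
    and train with far larger batches.
    """
    # L2-normalize so the dot product is cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = image_emb @ text_emb.T / temperature  # (batch, batch)
    n = len(logits)

    def cross_entropy(l):
        # Row-wise log-softmax with the standard max-subtraction for stability.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # Target for row i is column i (its paired example).
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

A quick sanity check: perfectly matched pairs (identical image and text embeddings) should score a lower loss than the same batch with the pairing shuffled.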
no code implementations • AACL 2020 • Gustavo Hernandez Abrego, Bowen Liang, Wei Wang, Zarana Parekh, Yinfei Yang, Yun-Hsuan Sung
We evaluate our methods on de-noising parallel texts and training neural machine translation models.
1 code implementation • ACL 2021 • Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh
Unlike previous approaches requiring style-labeled training data, our method makes use of readily-available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time.
no code implementations • 28 Sep 2020 • Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh
We present a novel approach to the challenging problem of label-free text style transfer.
2 code implementations • EACL 2021 • Zarana Parekh, Jason Baldridge, Daniel Cer, Austin Waters, Yinfei Yang
By supporting multi-modal retrieval training and evaluation, image captioning datasets have spurred remarkable progress on representation learning.
no code implementations • ACL 2020 • Wei Wang, Ye Tian, Jiquan Ngiam, Yinfei Yang, Isaac Caswell, Zarana Parekh
Most data selection research in machine translation focuses on improving a single domain.
1 code implementation • NAACL 2019 • Soham Ghosh, Anuva Agarwal, Zarana Parekh, Alexander Hauptmann
The task of retrieving clips within videos based on a given natural language query requires cross-modal reasoning over multiple frames.
1 code implementation • EMNLP 2017 • Aditya Sharma, Zarana Parekh, Partha Talukdar
RLIE-DQN is a recently proposed Reinforcement Learning-based Information Extraction (IE) technique that can incorporate external evidence during the extraction process.