no code implementations • 8 Feb 2024 • David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard, Krishna Narni, Yaqiao Luo, Lawrence Chen, Guan Pang, Ali Thabet, Peter Vajda, Amy Bearman, Licheng Yu
Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of temporal layers to model motion.
no code implementations • 17 Nov 2023 • Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy Bearman, Dhruv Mahajan
Evaluation results show our method improves visual quality by 14%, prompt alignment by 16. 2% and scene diversity by 15. 3%, compared to prompt engineering the base Emu model for stickers generation.
1 code implementation • 26 Oct 2022 • Seungwhan Moon, Andrea Madotto, Zhaojiang Lin, Alireza Dirafzoon, Aparajita Saraf, Amy Bearman, Babak Damavandi
We present IMU2CLIP, a novel pre-training approach to align Inertial Measurement Unit (IMU) motion sensor recordings with video and text, by projecting them into the joint representation space of Contrastive Language-Image Pre-training (CLIP).
1 code implementation • CVPR 2021 • Zihang Meng, Licheng Yu, Ning Zhang, Tamara Berg, Babak Damavandi, Vikas Singh, Amy Bearman
Learning the grounding of each word is challenging, due to noise in the human-provided traces and the presence of words that cannot be meaningfully visually grounded.
1 code implementation • 6 Jun 2015 • Amy Bearman, Olga Russakovsky, Vittorio Ferrari, Li Fei-Fei
The semantic image segmentation task presents a trade-off between test time accuracy and training-time annotation cost.