no code implementations • 14 Mar 2024 • Uriel Singer, Amit Zohar, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Devi Parikh, Yaniv Taigman
We introduce Emu Video Edit (EVE), a model that establishes a new state of the art in video editing without relying on any supervised video editing data.
no code implementations • 16 Nov 2023 • Shelly Sheynin, Adam Polyak, Uriel Singer, Yuval Kirstain, Amit Zohar, Oron Ashual, Devi Parikh, Yaniv Taigman
Lastly, to facilitate a more rigorous and informed assessment of instructable image editing models, we release a new challenging and versatile benchmark that includes seven different image editing tasks.
1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan
It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.
Ranked #2 on Text-to-Image Generation on MS COCO
no code implementations • 26 Jan 2023 • Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman
We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions.
no code implementations • CVPR 2023 • Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin
Due to the lack of large-scale datasets that have a detailed textual description for each region in the image, we choose to leverage the current large-scale text-to-image datasets and base our approach on a novel CLIP-based spatio-textual representation, and show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-based.
1 code implementation • 30 Sep 2022 • Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi
Finally, we explore the ability of the proposed method to generate audio continuation conditionally and unconditionally.
Ranked #12 on Audio Generation on AudioCaps
2 code implementations • 29 Sep 2022 • Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
We propose Make-A-Video -- an approach for directly translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V).
Ranked #3 on Text-to-Video Generation on MSR-VTT (CLIP-FID metric)
no code implementations • 6 Apr 2022 • Shelly Sheynin, Oron Ashual, Adam Polyak, Uriel Singer, Oran Gafni, Eliya Nachmani, Yaniv Taigman
Recent text-to-image models have achieved impressive results.
Ranked #34 on Text-to-Image Generation on MS COCO
1 code implementation • 24 Mar 2022 • Oran Gafni, Adam Polyak, Oron Ashual, Shelly Sheynin, Devi Parikh, Yaniv Taigman
Recent text-to-image generation methods provide a simple yet exciting conversion capability between text and image domains.
Ranked #20 on Text-to-Image Generation on MS COCO (using extra training data)
no code implementations • 31 Jan 2021 • Adam Polyak, Lior Wolf, Yossi Adi, Ori Kabeli, Yaniv Taigman
Speech enhancement has seen great improvement in recent years mainly through contributions in denoising, speaker separation, and dereverberation methods that mostly deal with environmental effects on vocal audio.
no code implementations • 6 Aug 2020 • Adam Polyak, Lior Wolf, Yossi Adi, Yaniv Taigman
We present a wav-to-wav generative model for the task of singing voice conversion from any identity.
Automatic Speech Recognition (ASR) +2
no code implementations • ICCV 2019 • Oran Gafni, Lior Wolf, Yaniv Taigman
We propose a method for face de-identification that enables fully automatic video modification at high frame rates.
no code implementations • ICLR 2019 • Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman
We present a method for translating music across musical instruments and styles.
no code implementations • 18 Apr 2019 • Adam Polyak, Lior Wolf, Yaniv Taigman
We present a fully convolutional wav-to-wav network for converting between speakers' voices, without relying on text.
Automatic Speech Recognition (ASR) +1
no code implementations • ICLR 2020 • Oran Gafni, Lior Wolf, Yaniv Taigman
The second network maps the current pose, the new pose, and a given background, to an output frame.
no code implementations • 29 Jul 2018 • Doron Sobol, Lior Wolf, Yaniv Taigman
For example, given a video frame in the target game, we map it to an analogous state in the source game and then attempt to play using a trained policy learned for the source game.
4 code implementations • 21 May 2018 • Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman
We present a method for translating music across musical instruments, genres, and styles.
no code implementations • ICML 2018 • Eliya Nachmani, Adam Polyak, Yaniv Taigman, Lior Wolf
Learning-based Text To Speech systems have the potential to generalize from one speaker to the next and thus require a relatively short sample of any new voice.
2 code implementations • ICLR 2018 • Yaniv Taigman, Lior Wolf, Adam Polyak, Eliya Nachmani
We present a new neural text to speech (TTS) method that is able to transform text to speech in voices that are sampled in the wild.
no code implementations • ICCV 2017 • Lior Wolf, Yaniv Taigman, Adam Polyak
We study the problem of mapping an input image to a tied pair consisting of a vector of parameters and an image that is created using a graphical engine from the vector of parameters.
6 code implementations • 7 Nov 2016 • Yaniv Taigman, Adam Polyak, Lior Wolf
We study the problem of transferring a sample in one domain to an analog sample in another domain.
no code implementations • CVPR 2015 • Ning Zhang, Manohar Paluri, Yaniv Taigman, Rob Fergus, Lubomir Bourdev
We explore the task of recognizing people's identities in photo albums in an unconstrained setting.
2 code implementations • Conference on Computer Vision and Pattern Recognition (CVPR) 2014 • Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf
In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify.
Ranked #1 on 3D Face Modelling on LFW
no code implementations • CVPR 2015 • Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf
Scaling machine learning methods to very large datasets has attracted considerable attention in recent years, thanks to easy access to ubiquitous sensing and data from the web.
no code implementations • 20 Dec 2013 • Omry Yadan, Keith Adams, Yaniv Taigman, Marc'Aurelio Ranzato
In this work we evaluate different approaches to parallelize computation of convolutional neural networks across several GPUs.
no code implementations • 4 Aug 2011 • Yaniv Taigman, Lior Wolf
We apply the face recognition technology developed in-house at face.com to a well-accepted benchmark and show that, without any tuning, we are able to considerably surpass state-of-the-art results.