Search Results for author: Aliaksandr Siarohin

Found 38 papers, 20 papers with code

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

no code implementations • 29 Feb 2024 • Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov

Next, we finetune a retrieval model on a small subset where the best caption of each video is manually selected and then employ the model in the whole dataset to select the best caption as the annotation.

Retrieval Text Retrieval +3

Paper
Add Code

Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

no code implementations • 22 Feb 2024 • Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov

Since video content is highly redundant, we argue that naively bringing advances of image models to the video generation domain reduces motion fidelity, visual quality and impairs scalability.

Ranked #1 on Text-to-Video Generation on MSR-VTT

Image Generation Text-to-Video Generation +1

Paper
Add Code

SPAD : Spatially Aware Multiview Diffusers

no code implementations • 7 Feb 2024 • Yash Kant, Ziyi Wu, Michael Vasilkovsky, Guocheng Qian, Jian Ren, Riza Alp Guler, Bernard Ghanem, Sergey Tulyakov, Igor Gilitschenski, Aliaksandr Siarohin

We present SPAD, a novel approach for creating consistent multi-view images from text prompts or single images.

3D Generation Novel View Synthesis +1

Paper
Add Code

AToM: Amortized Text-to-Mesh using 2D Diffusion

no code implementations • 1 Feb 2024 • Guocheng Qian, Junli Cao, Aliaksandr Siarohin, Yash Kant, Chaoyang Wang, Michael Vasilkovsky, Hsin-Ying Lee, Yuwei Fang, Ivan Skorokhodov, Peiye Zhuang, Igor Gilitschenski, Jian Ren, Bernard Ghanem, Kfir Aberman, Sergey Tulyakov

We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously.

Text to 3D

Paper
Add Code

Diffusion Priors for Dynamic View Synthesis from Monocular Videos

no code implementations • 10 Jan 2024 • Chaoyang Wang, Peiye Zhuang, Aliaksandr Siarohin, Junli Cao, Guocheng Qian, Hsin-Ying Lee, Sergey Tulyakov

Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.

Novel View Synthesis

Paper
Add Code

SceneWiz3D: Towards Text-guided 3D Scene Composition

no code implementations • 13 Dec 2023 • Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee

We are witnessing significant breakthroughs in the technology for generating 3D objects from text.

Text to 3D

Paper
Add Code

iNVS: Repurposing Diffusion Inpainters for Novel View Synthesis

no code implementations • 24 Oct 2023 • Yash Kant, Aliaksandr Siarohin, Michael Vasilkovsky, Riza Alp Guler, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Our approach focuses on maximizing the reuse of visible pixels from the source image.

Novel View Synthesis

Paper
Add Code

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

no code implementations • 12 Oct 2023 • Xian Liu, Jian Ren, Aliaksandr Siarohin, Ivan Skorokhodov, Yanyu Li, Dahua Lin, Xihui Liu, Ziwei Liu, Sergey Tulyakov

Our model enforces the joint learning of image appearance, spatial relationship, and geometry in a unified network, where each branch in the model complements to each other with both structural awareness and textural richness.

Image Generation

Paper
Add Code

AutoDecoding Latent 3D Diffusion Models

1 code implementation • NeurIPS 2023 • Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov

We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.

131

Paper
Code

Text-Guided Synthesis of Eulerian Cinemagraphs

1 code implementation • 6 Jul 2023 • Aniruddha Mahapatra, Aliaksandr Siarohin, Hsin-Ying Lee, Sergey Tulyakov, Jun-Yan Zhu

We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images.

Image Animation

344

Paper
Code

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

1 code implementation • 30 Jun 2023 • Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

We present Magic123, a two-stage coarse-to-fine approach for high-quality, textured 3D meshes generation from a single unposed image in the wild using both2D and 3D priors.

Image to 3D

1,457

Paper
Code

Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models

no code implementations • 23 Mar 2023 • Willi Menapace, Aliaksandr Siarohin, Stéphane Lathuilière, Panos Achlioptas, Vladislav Golyanik, Sergey Tulyakov, Elisa Ricci

Most captivatingly, our PGM unlocks the director's mode, where the game is played by specifying goals for the agents in the form of a prompt.

Navigate

Paper
Add Code

3D generation on ImageNet

no code implementations • 2 Mar 2023 • Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov

Existing 3D-from-2D generators are typically designed for well-curated single-category datasets, where all the objects have (approximately) the same scale, 3D location, and orientation, and the camera always points to the center of the scene.

3D Generation

Paper
Add Code

Invertible Neural Skinning

no code implementations • CVPR 2023 • Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline.

Paper
Add Code

Unsupervised Volumetric Animation

no code implementations • CVPR 2023 • Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Kyle Olszewski, Jian Ren, Hsin-Ying Lee, Menglei Chai, Sergey Tulyakov

We propose a novel approach for unsupervised 3D animation of non-rigid deformable objects.

Keypoint Estimation Novel View Synthesis

Paper
Add Code

InfiniCity: Infinite-Scale City Synthesis

no code implementations • ICCV 2023 • Chieh Hubert Lin, Hsin-Ying Lee, Willi Menapace, Menglei Chai, Aliaksandr Siarohin, Ming-Hsuan Yang, Sergey Tulyakov

Toward infinite-scale 3D city synthesis, we propose a novel framework, InfiniCity, which constructs and renders an unconstrainedly large and 3D-grounded environment from random noises.

Image Generation Neural Rendering

Paper
Add Code

3DAvatarGAN: Bridging Domains for Personalized Editable Avatars

no code implementations • CVPR 2023 • Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

Finally, we propose a novel inversion method for 3D-GANs linking the latent spaces of the source and the target domains.

Paper
Add Code

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

no code implementations • CVPR 2023 • Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, Sergey Tulyakov

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects.

3D-Aware Image Synthesis Object

Paper
Add Code

Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation

1 code implementation • 26 Aug 2022 • Jichao Zhang, Aliaksandr Siarohin, Yahui Liu, Hao Tang, Nicu Sebe, Wei Wang

Generative Neural Radiance Fields (GNeRF) based 3D-aware GANs have demonstrated remarkable capabilities in generating high-quality images while maintaining strong 3D consistency.

Attribute Disentanglement +2

Paper
Code

Playable Environments: Video Manipulation in Space and Time

1 code implementation • CVPR 2022 • Willi Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

We present Playable Environments - a new representation for interactive video generation and manipulation in space and time.

Video Generation

Paper
Code

3D-Aware Semantic-Guided Generative Model for Human Synthesis

1 code implementation • 2 Dec 2021 • Jichao Zhang, Enver Sangineto, Hao Tang, Aliaksandr Siarohin, Zhun Zhong, Nicu Sebe, Wei Wang

However, they usually struggle to generate high-quality images representing non-rigid objects, such as the human body, which is of a great interest for many computer graphics applications.

3D-Aware Image Synthesis

Paper
Code

Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization

1 code implementation • 31 May 2021 • Jichao Zhang, Aliaksandr Siarohin, Hao Tang, Enver Sangineto, Wei Wang, Humphrey Sh, Nicu Sebe

Moreover, we propose a novel Self-Training Part Replacement (STPR) strategy to refine the model for the texture-transfer task, which improves the quality of the generated clothes and the preservation ability of non-target regions.

Image-to-Image Translation Pose Transfer +1

Paper
Code

Motion Representations for Articulated Animation

2 code implementations • CVPR 2021 • Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

To facilitate animation and prevent the leakage of the shape of the driving object, we disentangle shape and pose of objects in the region space.

Ranked #1 on Video Reconstruction on Tai-Chi-HD (512)

Object Video Reconstruction

14,204

Paper
Code

Playable Video Generation

1 code implementation • CVPR 2021 • Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci

This paper introduces the unsupervised learning problem of playable video generation (PVG).

Video Generation

150

Paper
Code

Whitening for Self-Supervised Representation Learning

8 code implementations • 13 Jul 2020 • Aleksandr Ermolov, Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

Most of the current self-supervised representation learning (SSL) methods are based on the contrastive loss and the instance-discrimination task, where augmented versions of the same image instance ("positives") are contrasted with instances extracted from other images ("negatives").

Representation Learning Self-Supervised Learning

1,357

Paper
Code

TriGAN: Image-to-Image Translation for Multi-Source Domain Adaptation

no code implementations • 19 Apr 2020 • Subhankar Roy, Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe, Elisa Ricci

In this paper we propose the first approach for Multi-Source Domain Adaptation (MSDA) based on Generative Adversarial Networks.

Ranked #5 on Multi-Source Unsupervised Domain Adaptation on Office-Caltech10

Domain Adaptation Image-to-Image Translation +2

Paper
Add Code

Motion-supervised Co-Part Segmentation

2 code implementations • 7 Apr 2020 • Aliaksandr Siarohin, Subhankar Roy, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

To overcome this limitation, we propose a self-supervised deep learning method for co-part segmentation.

Ranked #3 on Unsupervised Human Pose Estimation on Tai-Chi-HD

Segmentation Unsupervised Human Pose Estimation

643

Paper
Code

First Order Motion Model for Image Animation

2 code implementations • NeurIPS 2019 • Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

To achieve this, we decouple appearance and motion information using a self-supervised formulation.

Ranked #1 on Video Reconstruction on Tai-Chi-HD

Image Animation Object +1

14,204

Paper
Code

DwNet: Dense warp-based network for pose-guided human video generation

2 code implementations • 21 Oct 2019 • Polina Zablotskaia, Aliaksandr Siarohin, Bo Zhao, Leonid Sigal

In this paper, we focus on human motion transfer - generation of a video depicting a particular subject, observed in a single image, performing a series of motions exemplified by an auxiliary (driving) video.

Video Generation

Paper
Code

Attention-based Fusion for Multi-source Human Image Generation

no code implementations • 7 May 2019 • Stéphane Lathuilière, Enver Sangineto, Aliaksandr Siarohin, Nicu Sebe

We present a generalization of the person-image generation task, in which a human image is generated conditioned on a target pose and a set X of source appearance images.

Image Generation

Paper
Add Code

Whitening and Coloring transform for GANs

no code implementations • ICLR 2019 • Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

In this paper we propose to generalize both BN and cBN using a Whitening and Coloring based batch normalization.

Paper
Add Code

Appearance and Pose-Conditioned Human Image Generation using Deformable GANs

1 code implementation • 30 Apr 2019 • Aliaksandr Siarohin, Stéphane Lathuilière, Enver Sangineto, Nicu Sebe

Specifically, given an image xa of a person and a target pose P(xb), extracted from a different image xb, we synthesize a new image of that person in pose P(xb), while preserving the visual details in xa.

Data Augmentation Generative Adversarial Network +2

384

Paper
Code

Unsupervised Domain Adaptation using Feature-Whitening and Consensus Loss

1 code implementation • CVPR 2019 • Subhankar Roy, Aliaksandr Siarohin, Enver Sangineto, Samuel Rota Bulo, Nicu Sebe, Elisa Ricci

A classifier trained on a dataset seldom works on other datasets obtained under different conditions due to domain shift.

General Classification Object Recognition +1

Paper
Code

Animating Arbitrary Objects via Deep Motion Transfer

1 code implementation • CVPR 2019 • Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe

This is achieved through a deep architecture that decouples appearance and motion information.

Image Animation motion prediction +2

461

Paper
Code

Enhancing Perceptual Attributes with Bayesian Style Generation

1 code implementation • 3 Dec 2018 • Aliaksandr Siarohin, Gloria Zen, Nicu Sebe, Elisa Ricci

Our approach takes as input a natural image and exploits recent models for deep style transfer and generative adversarial networks to change its style in order to modify a specific high-level attribute.

Attribute Style Transfer

Paper
Code

Whitening and Coloring batch transform for GANs

1 code implementation • ICLR 2019 • Aliaksandr Siarohin, Enver Sangineto, Nicu Sebe

In this paper we propose to generalize both BN and cBN using a Whitening and Coloring based batch normalization.

Image Generation

Paper
Code

Deformable GANs for Pose-based Human Image Generation

1 code implementation • CVPR 2018 • Aliaksandr Siarohin, Enver Sangineto, Stephane Lathuiliere, Nicu Sebe

Specifically, given an image of a person and a target pose, we synthesize a new image of that person in the novel pose.

Ranked #5 on Gesture-to-Gesture Translation on NTU Hand Digit

Generative Adversarial Network Gesture-to-Gesture Translation +2

384

Paper
Code

How to Make an Image More Memorable? A Deep Style Transfer Approach

1 code implementation • 6 Apr 2017 • Aliaksandr Siarohin, Gloria Zen, Cveta Majtanovic, Xavier Alameda-Pineda, Elisa Ricci, Nicu Sebe

In this work, we show that it is possible to automatically retrieve the best style seeds for a given image, thus remarkably reducing the number of human attempts needed to find a good match.

Image Generation Style Transfer

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.