no code implementations • 24 Dec 2023 • Christian Simon, Sen He, Juan-Manuel Perez-Rua, Mengmeng Xu, Amine Benhalloum, Tao Xiang
Solving image-to-3D from a single view is an ill-posed problem, and current neural reconstruction methods addressing it through diffusion models still rely on scene-specific optimization, constraining their generalization capability.
no code implementations • 7 Dec 2023 • Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua
In this study, we explore Transformer-based diffusion models for image and video generation.
no code implementations • 9 Oct 2023 • Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He
In this paper, for the first time, we introduce optical flow into the attention module in the diffusion model's U-Net to address the inconsistency issue for text-to-video editing.
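As a rough illustration of the idea, the sketch below warps a reference frame's features with an optical-flow field before they enter a cross-attention block, so each frame attends to flow-aligned features. The shapes, module names, and flow convention (pixel offsets) are assumptions, not the paper's exact design.

```python
# Illustrative sketch only: flow-guided cross-frame attention for a U-Net block.
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_with_flow(feat, flow):
    """Backward-warp feature map feat (B,C,H,W) with flow (B,2,H,W) in pixels."""
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat.device)   # (2,H,W)
    coords = grid.unsqueeze(0) + flow                             # sample positions
    # normalise to [-1, 1] for grid_sample
    coords_x = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    grid_n = torch.stack((coords_x, coords_y), dim=-1)            # (B,H,W,2)
    return F.grid_sample(feat, grid_n, align_corners=True)

class FlowGuidedAttention(nn.Module):
    def __init__(self, dim, heads=8):                             # dim divisible by heads
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, cur, ref, flow):
        # cur, ref: (B,C,H,W) U-Net features; flow: (B,2,H,W), ref -> cur
        B, C, H, W = cur.shape
        ref_aligned = warp_with_flow(ref, flow)      # spatially align the reference
        q = cur.flatten(2).transpose(1, 2)           # (B,HW,C)
        kv = ref_aligned.flatten(2).transpose(1, 2)
        out, _ = self.attn(q, kv, kv)                # query current, attend to aligned ref
        return out.transpose(1, 2).view(B, C, H, W)
```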
1 code implementation • 30 Mar 2023 • Aiyu Cui, Sen He, Tao Xiang, Antoine Toisoul
In this work, we propose a robust warping method for virtual try-on based on a learned garment DensePose which has a direct correspondence with the person's DensePose.
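A minimal sketch of the warping step only, assuming both images come with (u, v) DensePose maps: every person pixel looks up the garment pixel with the nearest body coordinate. Brute-force nearest neighbour is used for clarity; the learned-DensePose part of the method is not reproduced here.

```python
# Warp a garment image onto a person via DensePose (u,v) correspondence.
import torch

def densepose_warp(garment, garment_uv, person_uv):
    """garment: (B,3,H,W); garment_uv, person_uv: (B,2,H,W) in [0,1]."""
    B, _, H, W = garment.shape
    g_uv = garment_uv.flatten(2).transpose(1, 2)    # (B, HW, 2)
    p_uv = person_uv.flatten(2).transpose(1, 2)     # (B, HW, 2)
    # index of the garment pixel whose UV is closest to each person pixel's UV
    idx = torch.cdist(p_uv, g_uv).argmin(dim=-1)    # (B, HW)
    g_pix = garment.flatten(2).transpose(1, 2)      # (B, HW, 3)
    warped = torch.gather(g_pix, 1, idx.unsqueeze(-1).expand(-1, -1, 3))
    return warped.transpose(1, 2).view(B, 3, H, W)
```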
no code implementations • 6 Jan 2023 • Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic
Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos.
no code implementations • 19 Nov 2022 • Sen He, Yi-Zhe Song, Tao Xiang
Key to our model is a parallel flow estimation module that predicts the flow fields for both person and garment images conditioned on the target pose.
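A hedged sketch of what such a parallel module could look like: two sibling heads predict dense flow fields for the person and the garment image, both conditioned on target-pose heatmaps. Layer sizes and the conditioning scheme are illustrative assumptions.

```python
# Two parallel flow heads, each conditioned on the target pose.
import torch
import torch.nn as nn

class ParallelFlowEstimator(nn.Module):
    def __init__(self, feat_ch=64, pose_ch=18):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.person_enc = encoder(3 + pose_ch)    # person image + target pose maps
        self.garment_enc = encoder(3 + pose_ch)   # garment image + target pose maps
        self.person_flow = nn.Conv2d(feat_ch, 2, 3, padding=1)
        self.garment_flow = nn.Conv2d(feat_ch, 2, 3, padding=1)

    def forward(self, person, garment, pose):
        # person, garment: (B,3,H,W); pose: (B,pose_ch,H,W) keypoint heatmaps
        fp = self.person_enc(torch.cat([person, pose], dim=1))
        fg = self.garment_enc(torch.cat([garment, pose], dim=1))
        return self.person_flow(fp), self.garment_flow(fg)  # two (B,2,H,W) fields
```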
no code implementations • 15 Oct 2022 • Zhihe Lu, Sen He, Da Li, Yi-Zhe Song, Tao Xiang
To ensure that the fused scores are not biased toward either the base or the novel classes, a new Transformer-based calibration module is introduced (a sketch follows below).
Generalized Few-Shot Semantic Segmentation • Semantic Segmentation
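A rough sketch of one such calibration module: base-class and novel-class logits are embedded as one token per class, a small Transformer lets them interact, and a linear head emits recalibrated logits so neither group dominates by scale. All dimensions are assumptions.

```python
# Transformer-based calibration of concatenated base/novel class scores.
import torch
import torch.nn as nn

class ScoreCalibrator(nn.Module):
    def __init__(self, n_base, n_novel, dim=32):
        super().__init__()
        n_cls = n_base + n_novel
        self.embed = nn.Linear(1, dim)                        # scalar logit -> token
        self.cls_emb = nn.Parameter(torch.zeros(n_cls, dim))  # which class a token is
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(dim, 1)

    def forward(self, base_logits, novel_logits):
        # base_logits: (N, n_base), novel_logits: (N, n_novel) per pixel/sample
        logits = torch.cat([base_logits, novel_logits], dim=1)  # (N, C)
        tok = self.embed(logits.unsqueeze(-1)) + self.cls_emb   # (N, C, dim)
        return self.out(self.encoder(tok)).squeeze(-1)          # fused (N, C)
```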
1 code implementation • 6 Apr 2022 • Xiao Han, Sen He, Li Zhang, Yi-Zhe Song, Tao Xiang
In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR.
3 code implementations • CVPR 2022 • Sen He, Yi-Zhe Song, Tao Xiang
To achieve this, a key step is garment warping which spatially aligns the target garment with the corresponding body parts in the person image.
Ranked #1 on Virtual Try-on on VITON
no code implementations • 13 Dec 2021 • Tianyuan Yu, Sen He, Yi-Zhe Song, Tao Xiang
This is because they use an instance GNN as a label propagation/classification module, which is jointly meta-learned with a feature embedding network.
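For context, the sketch below shows generic graph-based label propagation over instance embeddings: pairwise affinities define an adjacency, and support labels diffuse to query nodes. This is the textbook scheme, not the exact meta-learned GNN the entry criticises.

```python
# Label propagation over an instance graph built from embeddings.
import torch

def propagate_labels(feats, support_onehot, n_support, alpha=0.5, steps=10):
    """feats: (N,D) support+query embeddings; support_onehot: (n_support, C)."""
    N, C = feats.shape[0], support_onehot.shape[1]
    sim = torch.exp(-torch.cdist(feats, feats) ** 2)   # Gaussian affinity
    sim.fill_diagonal_(0)
    d = sim.sum(dim=1, keepdim=True).clamp(min=1e-8)
    W = sim / d                                        # row-normalised adjacency
    Y = torch.zeros(N, C)
    Y[:n_support] = support_onehot                     # clamp the support labels
    Z = Y.clone()
    for _ in range(steps):
        Z = alpha * (W @ Z) + (1 - alpha) * Y          # diffuse, keep anchors
    return Z[n_support:].softmax(dim=-1)               # query predictions
```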
1 code implementation • 20 Oct 2021 • Xiao Han, Sen He, Li Zhang, Tao Xiang
Firstly, to fully utilize the existing small-scale benchmarking datasets for more discriminative feature learning, we introduce a cross-modal momentum contrastive learning framework to enrich the training data for a given mini-batch (a sketch follows below).
Ranked #10 on Text based Person Retrieval on CUHK-PEDES (using extra training data)
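A hedged sketch of cross-modal momentum contrast in the MoCo style: an image query is contrasted against its matched text key plus a queue of past text keys produced by a momentum-updated text encoder (the full method would also use the symmetric text-to-image direction, and the queue update is omitted). Encoder architectures are placeholders.

```python
# Cross-modal MoCo-style loss: image queries vs. momentum text keys + queue.
import torch
import torch.nn.functional as F

@torch.no_grad()
def momentum_update(enc_q, enc_k, m=0.999):
    """EMA-update the key (momentum) encoder from the query encoder."""
    for pq, pk in zip(enc_q.parameters(), enc_k.parameters()):
        pk.data.mul_(m).add_(pq.data, alpha=1 - m)

def cross_modal_moco_loss(img_q, txt_k, queue, tau=0.07):
    """img_q: (B,D) image queries; txt_k: (B,D) momentum text keys (no grad);
    queue: (K,D) past text keys acting as negatives."""
    img_q, txt_k = F.normalize(img_q, dim=1), F.normalize(txt_k, dim=1)
    l_pos = (img_q * txt_k).sum(dim=1, keepdim=True)   # (B,1) matched pairs
    l_neg = img_q @ F.normalize(queue, dim=1).t()      # (B,K) queued negatives
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)             # positive is always index 0
```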
1 code implementation • ICCV 2021 • Zhihe Lu, Sen He, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
A few-shot semantic segmentation model is typically composed of a CNN encoder, a CNN decoder and a simple classifier (separating foreground and background pixels).
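A sketch of exactly the pipeline this entry describes, with purely illustrative layer choices: a CNN encoder, a CNN decoder, and a 1x1-conv classifier separating foreground from background pixels.

```python
# Typical few-shot segmentation structure: encoder -> decoder -> fg/bg classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FewShotSegModel(nn.Module):
    def __init__(self, feat_ch=256):
        super().__init__()
        self.encoder = nn.Sequential(                  # stand-in for a deep backbone
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.classifier = nn.Conv2d(feat_ch, 2, 1)     # foreground vs background

    def forward(self, x):
        h, w = x.shape[-2:]
        logits = self.classifier(self.decoder(self.encoder(x)))
        return F.interpolate(logits, size=(h, w), mode="bilinear",
                             align_corners=False)      # back to input resolution
```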
no code implementations • ICCV 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
The generated face image given a target age code is expected to be age-sensitive, as reflected by bio-plausible transformations of shape and texture, while being identity-preserving.
1 code implementation • CVPR 2021 • Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang
We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and of location-sensitive appearance representation in their discriminators (one generic construction of the latter is sketched below).
Ranked #1 on Layout-to-Image Generation on COCO-Stuff 128x128
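One generic way to make a discriminator's appearance representation location-sensitive is to append normalised coordinate channels to its input, CoordConv-style. This is an illustrative stand-in, not the paper's exact design.

```python
# Patch discriminator with coordinate channels so "where" matters, not just "what".
import torch
import torch.nn as nn

class LocationSensitiveDiscriminator(nn.Module):
    def __init__(self, in_ch=3, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + 2, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch * 2, 1, 4, padding=1))        # patch-level real/fake map

    def forward(self, img):
        B, _, H, W = img.shape
        ys = torch.linspace(-1, 1, H, device=img.device).view(1, 1, H, 1).expand(B, 1, H, W)
        xs = torch.linspace(-1, 1, W, device=img.device).view(1, 1, 1, W).expand(B, 1, H, W)
        return self.net(torch.cat([img, xs, ys], dim=1))
```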
2 code implementations • 29 Apr 2020 • Sen He, Wentong Liao, Hamed R. Tavakoli, Michael Yang, Bodo Rosenhahn, Nicolas Pugeault
Inspired by its successes in text analysis and translation, previous work has proposed the Transformer architecture for image captioning.
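A hedged sketch in the spirit the entry describes: image region/patch features act as the encoder memory and a Transformer decoder generates the caption autoregressively. Vocabulary size and dimensions are placeholders.

```python
# Transformer decoder over image features for captioning.
import torch
import torch.nn as nn

class CaptionTransformer(nn.Module):
    def __init__(self, vocab=10000, dim=512, max_len=50):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.pos = nn.Embedding(max_len, dim)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True),
            num_layers=3)
        self.out = nn.Linear(dim, vocab)

    def forward(self, img_feats, tokens):
        # img_feats: (B, R, dim) region/patch features; tokens: (B, T) word ids
        T = tokens.size(1)
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        causal = torch.triu(torch.full((T, T), float("-inf"),
                                       device=tokens.device), diagonal=1)
        h = self.decoder(x, img_feats, tgt_mask=causal)  # attend to image memory
        return self.out(h)                               # next-word logits (B,T,vocab)
```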
1 code implementation • CVPR 2019 • Sen He, Hamed R. Tavakoli, Ali Borji, Yang Mi, Nicolas Pugeault
Our analyses reveal that: 1) some visual regions (e.g. head, text, symbol, vehicle) are already encoded within various layers of a network pre-trained for object recognition; 2) on modern datasets, fine-tuning pre-trained models for saliency prediction makes them favor some categories (e.g. head) over others (e.g. text); 3) although deep saliency models outperform classical models on natural images, the converse is true for synthetic stimuli (e.g. pop-out search arrays), evidence of a significant difference between human and data-driven saliency models; and 4) we confirm that, after fine-tuning, the change in inner representations is mostly due to the task rather than the domain shift in the data.
no code implementations • ICCV 2019 • Sen He, Hamed R. Tavakoli, Ali Borji, Nicolas Pugeault
In this work, we present a novel dataset consisting of eye movements and verbal descriptions recorded synchronously over images.
no code implementations • 15 Mar 2018 • Sen He, Nicolas Pugeault
Early saliency models were based on low-level hand-crafted features derived from insights gained in neuroscience and psychophysics.
no code implementations • 15 Mar 2018 • Sen He, Dmitry Kangin, Yang Mi, Nicolas Pugeault
In this paper, we apply the attention mechanism to autonomous driving for steering angle prediction.
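A minimal sketch of one way to do this: a saliency-like spatial attention map reweights CNN features before a single steering angle is regressed, so the model can focus on road-relevant regions. The architecture is an illustrative assumption.

```python
# Spatial-attention pooling over CNN features, then steering regression.
import torch
import torch.nn as nn

class AttentionSteering(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, ch, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 5, stride=2, padding=2), nn.ReLU())
        self.attn = nn.Conv2d(ch, 1, 1)        # one scalar weight per location
        self.head = nn.Linear(ch, 1)           # steering angle (e.g. radians)

    def forward(self, frame):
        f = self.backbone(frame)                            # (B,ch,H,W)
        a = torch.softmax(self.attn(f).flatten(2), dim=-1)  # (B,1,HW) attention map
        pooled = (f.flatten(2) * a).sum(dim=-1)             # attention-weighted pool
        return self.head(pooled).squeeze(-1)                # (B,) predicted angle
```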
no code implementations • 15 Mar 2018 • Sen He, Ali Borji, Yang Mi, Nicolas Pugeault
Deep convolutional neural networks have demonstrated high performances for fixation prediction in recent years.
no code implementations • 12 Jan 2018 • Sen He, Nicolas Pugeault
Moreover, we argue that this transformation leads to the emergence of receptive fields conceptually similar to the centre-surround filters hypothesized by early research on visual saliency.
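A small illustration of the centre-surround idea the entry refers to: a difference-of-Gaussians filter responds strongly where a region differs from its surround, the classic precursor of saliency maps. Kernel size and sigmas are arbitrary demo choices.

```python
# Difference-of-Gaussians centre-surround response on a grayscale image.
import torch
import torch.nn.functional as F

def gaussian_kernel(size=15, sigma=2.0):
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size)

def centre_surround(gray, sigma_c=1.0, sigma_s=4.0):
    """gray: (B,1,H,W) intensity image -> centre-surround response map."""
    kc = gaussian_kernel(sigma=sigma_c).to(gray.device)
    ks = gaussian_kernel(sigma=sigma_s).to(gray.device)
    centre = F.conv2d(gray, kc, padding=7)      # narrow Gaussian: the centre
    surround = F.conv2d(gray, ks, padding=7)    # wide Gaussian: the surround
    return (centre - surround).abs()            # high where a point pops out
```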