1 code implementation • 7 May 2024 • Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens
Our approach makes use of two insights: (i) common vision-based touch sensors are built on ordinary cameras and thus can be registered to images using methods from multi-view geometry, and (ii) visually and structurally similar regions of a scene share the same tactile features.
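Insight (ii) can be illustrated with a minimal sketch: propagate the tactile features of the few touched patches to every image patch whose visual features are most similar. This is a toy nearest-neighbor version under assumed inputs (`image_feats`, `touch_feats`, `touched_idx` are hypothetical names), not the paper's actual pipeline.

```python
import numpy as np

def propagate_tactile(image_feats, touch_feats, touched_idx):
    """Assign each image patch the tactile feature of its most visually
    similar touched patch (cosine similarity). A toy sketch, not the
    paper's method.

    image_feats : (N, D) visual features for all N patches
    touch_feats : (M, D_t) tactile features for the M touched patches
    touched_idx : indices (into the N patches) of the touched patches
    """
    # Normalize visual features so dot products become cosine similarities.
    feats = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sims = feats @ feats[touched_idx].T   # (N, M) visual similarity matrix
    nearest = sims.argmax(axis=1)         # best-matching touched patch per patch
    return touch_feats[nearest]           # (N, D_t) propagated tactile features
```

A touched patch is maximally similar to itself, so it keeps its own tactile feature; untouched patches inherit from their closest visual match.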
1 code implementation • 4 Apr 2024 • Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong
To test this, we focus on monocular depth estimation, the problem of predicting a dense depth map from a single image, but with an additional text caption describing the scene.
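One simple way such a caption could condition a depth network is FiLM-style feature modulation: the text embedding predicts a per-channel scale and shift applied to the image features before decoding. This is an assumed mechanism for illustration only (the function and the projection matrices `W_gamma`, `W_beta` are hypothetical), not necessarily the paper's architecture.

```python
import numpy as np

def condition_on_text(img_feats, txt_emb, W_gamma, W_beta):
    """FiLM-style conditioning sketch: the caption embedding predicts a
    per-channel scale (gamma) and shift (beta) that modulate the image
    feature map before depth decoding.

    img_feats : (C, H, W) image feature map
    txt_emb   : (D,) caption embedding
    W_gamma, W_beta : (C, D) hypothetical learned projection matrices
    """
    gamma = W_gamma @ txt_emb   # (C,) per-channel scale from the caption
    beta = W_beta @ txt_emb     # (C,) per-channel shift from the caption
    return img_feats * gamma[:, None, None] + beta[:, None, None]
```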
1 code implementation • 3 Mar 2024 • Boyang Wang, Fengyu Yang, Xihang Yu, Chao Zhang, Hanbin Zhao
In addition, we identify two anime-specific challenges of distorted and faint hand-drawn lines and unwanted color artifacts.
no code implementations • 31 Jan 2024 • Fengyu Yang, Chao Feng, Ziyang Chen, Hyoungseob Park, Daniel Wang, Yiming Dou, Ziyao Zeng, Xien Chen, Rit Gangopadhyay, Andrew Owens, Alex Wong
We introduce UniTouch, a unified tactile model for vision-based touch sensors connected to multiple modalities, including vision, language, and sound.
1 code implementation • 2 Nov 2023 • Boyang Wang, Bowen Liu, Shiyu Liu, Fengyu Yang
In this work, we present, for the first time, a video compression-based degradation model to synthesize low-resolution image data for the blind SISR task.
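The core artifact of block-based video codecs is coarse quantization of per-block DCT coefficients. The sketch below mimics that degradation in pure NumPy (quantizing 8x8 block DCTs of a grayscale image); it is a crude stand-in for illustration, not the paper's actual codec-based pipeline, and `codec_degrade` and its parameters are hypothetical names.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: D @ D.T == I, so D.T inverts it."""
    k = np.arange(n)
    D = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0] *= 1 / np.sqrt(2)
    return D * np.sqrt(2 / n)

def codec_degrade(img, block=8, q=24.0):
    """Crude compression-artifact stand-in: quantize the DCT coefficients
    of each block, as block-based video codecs do. `img` is a float
    grayscale image whose sides are multiples of `block`.
    """
    D = dct_matrix(block)
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            patch = img[y:y + block, x:x + block]
            coeffs = D @ patch @ D.T            # 2-D DCT of the block
            coeffs = np.round(coeffs / q) * q   # lossy quantization step
            out[y:y + block, x:x + block] = D.T @ coeffs @ D
    return out
```

Larger `q` discards more coefficient precision, producing stronger blocking artifacts in the synthesized low-quality image.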
no code implementations • ICCV 2023 • Fengyu Yang, Jiacheng Zhang, Andrew Owens
An emerging line of work has sought to generate plausible imagery from touch.
1 code implementation • 10 Sep 2023 • Jiong Wang, Fengyu Yang, Wenbo Gou, Bingliang Li, Danqi Yan, Ailing Zeng, Yijun Gao, Junle Wang, Yanqing Jing, Ruimao Zhang
To facilitate the development of 3D pose estimation, we present FreeMan, the first large-scale, multi-view dataset collected under real-world conditions.
1 code implementation • CVPR 2023 • Shaokai Wu, Fengyu Yang
Detection-based methods have been viewed unfavorably in crowd analysis due to their poor performance in dense crowds.
1 code implementation • 23 Aug 2023 • Siyue Yao, MingJie Sun, Bingliang Li, Fengyu Yang, Junle Wang, Ruimao Zhang
In this paper, we introduce a novel multi-dancer synthesis task called partner dancer generation, which involves synthesizing virtual human dancers capable of dancing with users.
1 code implementation • 20 May 2023 • Jie Yang, Bingliang Li, Fengyu Yang, Ailing Zeng, Lei Zhang, Ruimao Zhang
Extensive experiments demonstrate that DiffHOI significantly outperforms the state-of-the-art in regular detection (i.e., 41.50 mAP) and zero-shot detection.
Ranked #2 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)
no code implementations • 7 Dec 2022 • Fengyu Yang, Jian Luan, Yujun Wang
We introduce a phonology embedding to capture the differences between different English phonologies.
no code implementations • 22 Nov 2022 • Fengyu Yang, Chenyang Ma, Jiacheng Zhang, Jing Zhu, Wenzhen Yuan, Andrew Owens
The ability to associate touch with sight is essential for tasks that require physically interacting with objects in the world.
no code implementations • 16 Mar 2022 • Hanbin Zhao, Fengyu Yang, Xinghe Fu, Xi Li
In practice, new images are usually made available in a consecutive manner, leading to a problem called Continual Semantic Segmentation (CSS).
no code implementations • CVPR 2022 • Fengyu Yang, Chenyang Ma
In particular, to enhance the sparsity of the latent space, we design a prototypical contrastive learning objective that clusters prototypes of the same category together and pushes prototypes of different categories far apart.
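A minimal sketch of such an objective is an InfoNCE-style loss over prototype vectors: same-category prototypes are treated as positives, different-category prototypes as negatives. This is an illustrative stand-in, not necessarily the paper's exact loss, and the function name and `tau` default are assumptions.

```python
import numpy as np

def prototype_contrastive_loss(protos, labels, tau=0.1):
    """InfoNCE-style loss over class prototypes (illustrative sketch):
    prototypes sharing a category are pulled together, prototypes of
    different categories pushed apart.

    protos : (N, D) prototype vectors
    labels : (N,) category label of each prototype
    """
    p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    sim = (p @ p.T) / tau                       # temperature-scaled cosine sims
    np.fill_diagonal(sim, -np.inf)              # exclude self-pairs
    # Log-softmax over each prototype's similarities to all others.
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    same = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    # Negative average log-probability of same-category (positive) pairs.
    return -logp[same].mean()
```

Well-separated category clusters yield a lower loss than mixed ones, which is the clustering behavior the sentence above describes.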
no code implementations • 16 Jun 2021 • Zhichao Wang, Xinyong Zhou, Fengyu Yang, Tao Li, Hongqiang Du, Lei Xie, Wendong Gan, Haitao Chen, Hai Li
Specifically, prosodic features are used to explicitly model prosody, while a VAE and a reference encoder are used to model prosody implicitly, taking the Mel spectrum and bottleneck features as input, respectively.