Search Results for author: Jinbin Bai

Found 10 papers, 7 papers with code

An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control

1 code implementation • 7 Mar 2024 • Aosong Feng, Weikang Qiu, Jinbin Bai, Kaicheng Zhou, Zhen Dong, Xiao Zhang, Rex Ying, Leandros Tassiulas

Building on the success of text-to-image diffusion models (DPMs), image editing is an important application to enable human interaction with AI-generated content.

Descriptive

CVPR 2023 Text Guided Video Editing Competition

1 code implementation • 24 Oct 2023 • Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest Iandola

In this paper we present a retrospective on the competition and describe the winning method.

Video Editing, Video Generation

Integrating View Conditions for Image Synthesis

1 code implementation • 24 Oct 2023 • Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou, Mike Zheng Shou

In the field of image processing, applying intricate semantic modifications within existing images remains an enduring challenge.

Image Generation, Object

Sparse Sampling Transformer with Uncertainty-Driven Ranking for Unified Removal of Raindrops and Rain Streaks

1 code implementation • ICCV 2023 • Sixiang Chen, Tian Ye, Jinbin Bai, ErKang Chen, Jun Shi, Lei Zhu

In the real world, image degradations caused by rain often exhibit a combination of rain streaks and raindrops, thereby increasing the challenges of recovering the underlying clean image.

Rain Removal

Taming Diffusion Models for Music-driven Conducting Motion Generation

1 code implementation • 15 Jun 2023 • Zhuoran Zhao, Jinbin Bai, Delong Chen, Debang Wang, Yubo Pan

Generating the motion of orchestral conductors from a given piece of symphony music is a challenging task since it requires a model to learn semantic music features and capture the underlying distribution of real conducting motion.

Five A$^{+}$ Network: You Only Need 9K Parameters for Underwater Image Enhancement

1 code implementation • 15 May 2023 • Jingxia Jiang, Tian Ye, Jinbin Bai, Sixiang Chen, Wenhao Chai, Shi Jun, Yun Liu, ErKang Chen

In this work, we propose the Five A$^{+}$ Network (FA$^{+}$Net), a highly efficient and lightweight real-time underwater image enhancement network with only $\sim$9k parameters and $\sim$0.01s processing time.

Computational Efficiency, Image Enhancement

RSFDM-Net: Real-time Spatial and Frequency Domains Modulation Network for Underwater Image Enhancement

no code implementations • 23 Feb 2023 • Jingxia Jiang, Jinbin Bai, Yun Liu, Junjie Yin, Sixiang Chen, Tian Ye, ErKang Chen

Underwater images typically experience mixed degradations of brightness and structure caused by the absorption and scattering of light by suspended particles.

Image Enhancement

Translating Natural Language to Planning Goals with Large-Language Models

1 code implementation • 10 Feb 2023 • Yaqi Xie, Chen Yu, Tongyao Zhu, Jinbin Bai, Ze Gong, Harold Soh

Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains.

Translation

Adverse Weather Removal with Codebook Priors

no code implementations • ICCV 2023 • Tian Ye, Sixiang Chen, Jinbin Bai, Jun Shi, Chenghao Xue, Jingxia Jiang, Junjie Yin, ErKang Chen, Yun Liu

Inspired by recent advancements in codebook and vector quantization (VQ) techniques, we present a novel Adverse Weather Removal network with Codebook Priors (AWRCP) to address the problem of unified adverse weather removal.

Quantization

LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval

no code implementations • 11 Jul 2022 • Jinbin Bai, Chunhui Liu, Feiyue Ni, Haofan Wang, Mengying Hu, Xiaofeng Guo, Lele Cheng

To overcome the above issue, we present a novel mechanism for learning the translation relationship from a source modality space $\mathcal{S}$ to a target modality space $\mathcal{T}$ without the need for a joint latent space, which bridges the gap between visual and textual domains.

Ranked #11 on Zero-Shot Video Retrieval on MSVD

Representation Learning Retrieval +4
