Search Results for author: Chaoyue Wang

Found 49 papers, 24 papers with code

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

no code implementations 18 Mar 2024 Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li

Drawing on recent advancements in diffusion models for text-to-image generation, identity-preserved personalization has made significant progress in accurately capturing specific identities with just a single reference image.

Text-to-Image Generation

When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability

no code implementations 1 Mar 2024 Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao

Subsequently, to enhance controllability with inexplicit masks, an advanced Shape-aware ControlNet consisting of a deterioration estimator and a shape-prior modulation block is devised.

Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping

1 code implementation 29 Feb 2024 Jianbin Zheng, Minghui Hu, Zhongyi Fan, Chaoyue Wang, Changxing Ding, DaCheng Tao, Tat-Jen Cham

Consequently, we introduce Trajectory Consistency Distillation (TCD), which encompasses trajectory consistency function and strategic stochastic sampling.

Image Generation

HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting

1 code implementation 29 Nov 2023 Wenquan Lu, Yufei Xu, Jing Zhang, Chaoyue Wang, DaCheng Tao

Given an image whose generation failed due to malformed hands, we utilize ControlNet modules to re-inject correct hand information.

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls

no code implementations 27 Nov 2023 Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham

By integrating a compact network and incorporating an additional simple yet effective step during inference, OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.

Denoising

Decompose Semantic Shifts for Composed Image Retrieval

no code implementations 18 Sep 2023 Xingyu Yang, Daqing Liu, Heng Zhang, Yong Luo, Chaoyue Wang, Jing Zhang

Composed image retrieval is an image retrieval task in which the user provides a reference image as a starting point and a text specifying how to shift from that starting point to the desired target image.

Image Retrieval Retrieval

PartSeg: Few-shot Part Segmentation via Part-aware Prompt Learning

no code implementations 24 Aug 2023 Mengya Han, Heliang Zheng, Chaoyue Wang, Yong Luo, Han Hu, Jing Zhang, Yonggang Wen

In this work, we address the task of few-shot part segmentation, which aims to segment the different parts of an unseen object using very few labeled examples.

Language Modelling Segmentation

Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation

no code implementations 5 Aug 2023 Yiyang Chen, Shanshan Zhao, Changxing Ding, Liyao Tang, Chaoyue Wang, DaCheng Tao

In recent years, cross-modal domain adaptation has been studied on the paired 2D image and 3D LiDAR data to ease the labeling costs for 3D LiDAR semantic segmentation (3DLSS) in the target domain.

Domain Adaptation LIDAR Semantic Segmentation +1

Cocktail: Mixing Multi-Modality Controls for Text-Conditional Image Generation

no code implementations 1 Jun 2023 Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham

In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.

Conditional Image Generation

Null-text Guidance in Diffusion Models is Secretly a Cartoon-style Creator

no code implementations 11 May 2023 Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wanrong Huang, Wenjing Yang

Specifically, we propose two disturbance methods, i.e., Rollback disturbance (Back-D) and Image disturbance (Image-D), to construct misalignment between the noisy images used for predicting null-text guidance and text guidance (subsequently referred to as the null-text noisy image and the text noisy image, respectively) in the sampling process.
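The null-text guidance the snippet perturbs is the unconditional branch of standard classifier-free guidance, where the final noise prediction extrapolates from the null-text prediction toward the text-conditional one. A minimal numerical sketch of that standard combination (illustrative names only, not the authors' implementation):

```python
import numpy as np

def classifier_free_guidance(eps_text, eps_null, scale):
    """Combine the text-conditional and null-text (unconditional) noise
    predictions. scale > 1 strengthens text guidance; scale = 0 keeps
    only the null-text prediction."""
    return eps_null + scale * (eps_text - eps_null)

# Toy noise predictions for a 2x2 "image".
eps_text = np.array([[0.2, -0.1], [0.4, 0.0]])
eps_null = np.array([[0.1, 0.1], [0.1, 0.1]])

guided = classifier_free_guidance(eps_text, eps_null, scale=7.5)
```

Perturbing which noisy image eps_null is predicted from, as the disturbance methods above do, changes only the null-text branch of this combination.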

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

no code implementations 10 May 2023 Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, DaCheng Tao

To this end, we propose to generate images conditioned on the compositions of multimodal control signals, where modalities are imperfectly complementary, i.e., composed multimodal conditional image synthesis (CMCIS).

Image Generation

MagicFusion: Boosting Text-to-Image Generation Performance by Fusing Diffusion Models

no code implementations ICCV 2023 Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wenjing Yang

The advent of open-source AI communities has produced a cornucopia of powerful text-guided diffusion models that are trained on various datasets.

Text-to-Image Generation

ESceme: Vision-and-Language Navigation with Episodic Scene Memory

1 code implementation 2 Mar 2023 Qi Zheng, Daqing Liu, Chaoyue Wang, Jing Zhang, Dadong Wang, DaCheng Tao

Vision-and-language navigation (VLN) simulates a visual agent that follows natural-language navigation instructions in real-world scenes.

Vision and Language Navigation

Eliminating Contextual Prior Bias for Semantic Image Editing via Dual-Cycle Diffusion

1 code implementation 5 Feb 2023 Zuopeng Yang, Tianshu Chu, Xin Lin, Erdun Gao, Daqing Liu, Jie Yang, Chaoyue Wang

The proposed model incorporates a Bias Elimination Cycle that consists of both a forward path and an inverted path, each featuring a Structural Consistency Cycle to ensure the preservation of image content during the editing process.

Text-to-Image Generation

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

1 code implementation 12 Dec 2022 Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao, Yu Qiao

Specifically, a large stroke-wise dataset is constructed, and a stroke-wise diffusion model is proposed to preserve the structure and completeness of each generated character.

Font Generation

Unified Discrete Diffusion for Simultaneous Vision-Language Generation

1 code implementation 27 Nov 2022 Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan

The recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling multi-modality signals.

multimodal generation Text Generation +1

Cross-Modal Contrastive Learning for Robust Reasoning in VQA

1 code implementation 21 Nov 2022 Qi Zheng, Chaoyue Wang, Daqing Liu, Dadong Wang, DaCheng Tao

For each positive pair, we regard the images from different graphs as negative samples and derive a multi-positive version of contrastive learning.
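Multi-positive contrastive learning of this kind can be sketched as an InfoNCE loss averaged over several positives per anchor. A generic sketch under that standard formulation (illustrative names, not the paper's code):

```python
import numpy as np

def multi_positive_info_nce(anchor, positives, negatives, temperature=0.1):
    """InfoNCE with multiple positives: average the per-positive
    -log(exp(s_pos/t) / sum_all exp(s/t)) terms for one anchor."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    pos_sims = np.array([cos(anchor, p) for p in positives]) / temperature
    neg_sims = np.array([cos(anchor, n) for n in negatives]) / temperature
    all_sims = np.concatenate([pos_sims, neg_sims])
    log_denom = np.log(np.exp(all_sims).sum())
    # -log softmax of each positive, averaged over all positives
    return float(np.mean(log_denom - pos_sims))
```

The loss shrinks as the anchor aligns with its positives and grows as it aligns with negatives, which is the behaviour the robust-reasoning objective above relies on.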

Contrastive Learning Question Answering +1

Leveraging GAN Priors for Few-Shot Part Segmentation

1 code implementation 27 Jul 2022 Mengya Han, Heliang Zheng, Chaoyue Wang, Yong Luo, Han Hu, Bo Du

Overall, this work is an attempt to explore the internal relevance between generation tasks and perception tasks by prompt designing.

Image Generation Segmentation

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

1 code implementation 18 Jul 2022 Ziqiang Li, Chaoyue Wang, Heliang Zheng, Jing Zhang, Bin Li

Since data augmentation strategies have largely alleviated training instability, how to further improve the generative performance of DE-GANs has become a hotspot.

Contrastive Learning Data Augmentation

SemMAE: Semantic-Guided Masking for Learning Masked Autoencoders

1 code implementation 21 Jun 2022 Gang Li, Heliang Zheng, Daqing Liu, Chaoyue Wang, Bing Su, Changwen Zheng

In this paper, we explore a potential visual analogue of words, i.e., semantic parts, and we integrate semantic information into the training process of MAE by proposing a Semantic-Guided Masking strategy.

Language Modelling Masked Language Modeling +1

Bypass Network for Semantics Driven Image Paragraph Captioning

no code implementations 21 Jun 2022 Qi Zheng, Chaoyue Wang, Dadong Wang

Most existing methods model the coherence through the topic transition that dynamically infers a topic vector from preceding sentences.

Image Paragraph Captioning Sentence

Recent Advances for Quantum Neural Networks in Generative Learning

no code implementations 7 Jun 2022 Jinkai Tian, Xiaoyu Sun, Yuxuan Du, Shanshan Zhao, Qing Liu, Kaining Zhang, Wei Yi, Wanrong Huang, Chaoyue Wang, Xingyao Wu, Min-Hsiu Hsieh, Tongliang Liu, Wenjing Yang, DaCheng Tao

Due to the intrinsic probabilistic nature of quantum mechanics, it is reasonable to postulate that quantum generative learning models (QGLMs) may surpass their classical counterparts.

BIG-bench Machine Learning Quantum Machine Learning

Modeling Image Composition for Complex Scene Generation

1 code implementation CVPR 2022 Zuopeng Yang, Daqing Liu, Chaoyue Wang, Jie Yang, DaCheng Tao

Compared to existing CNN-based and Transformer-based generation models, which entangle modeling at the pixel and patch levels and at the object and patch levels, respectively, the proposed focal attention predicts the current patch token by focusing only on the highly related tokens specified by the spatial layout, thereby achieving disambiguation during training.

Layout-to-Image Generation Object +1

Visual Superordinate Abstraction for Robust Concept Learning

no code implementations 28 May 2022 Qi Zheng, Chaoyue Wang, Dadong Wang, DaCheng Tao

Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks.

Attribute Question Answering +1

Neural Maximum A Posteriori Estimation on Unpaired Data for Motion Deblurring

1 code implementation 26 Apr 2022 Youjian Zhang, Chaoyue Wang, DaCheng Tao

The proposed NeurMAP is an orthogonal approach to existing deblurring neural networks, and is the first framework that enables training image deblurring networks on unpaired datasets.

Deblurring Image Deblurring +1

A Comprehensive Survey on Data-Efficient GANs in Image Generation

no code implementations 18 Apr 2022 Ziqiang Li, Beihao Xia, Jing Zhang, Chaoyue Wang, Bin Li

Generative Adversarial Networks (GANs) have achieved remarkable results in image synthesis.

Image Generation

BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning

1 code implementation 4 Apr 2022 Zhi Hou, Baosheng Yu, Chaoyue Wang, Yibing Zhan, DaCheng Tao

Specifically, when applying the proposed module, it employs a two-stream pipeline during training, i.e., either with or without a BatchFormerV2 module, where the BatchFormerV2 stream can be removed for testing.

Image Classification object-detection +3

Self-Augmented Unpaired Image Dehazing via Density and Depth Decomposition

1 code implementation CVPR 2022 Yang Yang, Chaoyue Wang, Risheng Liu, Lin Zhang, Xiaojie Guo, DaCheng Tao

With estimated scene depth, our method is capable of re-rendering hazy images with different thicknesses which further benefits the training of the dehazing network.
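Re-rendering haze of different thicknesses from scene depth typically follows the standard atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·depth). A toy sketch of that well-known model (illustrative only, not the authors' implementation):

```python
import numpy as np

def render_haze(clear, depth, beta, airlight=1.0):
    """Atmospheric scattering model: blend the clear image toward the
    airlight color. Larger beta (haze density) or larger depth reduces
    transmission and thickens the haze."""
    t = np.exp(-beta * depth)          # per-pixel transmission in (0, 1]
    return clear * t + airlight * (1.0 - t)

# Flat clear scene with varying depth; two haze densities.
clear = np.full((2, 2), 0.2)
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
hazy_thin = render_haze(clear, depth, beta=0.1)
hazy_thick = render_haze(clear, depth, beta=1.0)
```

Sweeping beta is what makes re-rendering at "different thicknesses" possible once depth has been estimated.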

Image Dehazing

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

1 code implementation AAAI 2022 Yue He, Chen Chen, Jing Zhang, Juhua Liu, Fengxiang He, Chaoyue Wang, Bo Du

Technically, given the character segmentation maps predicted by a VR model, we construct a subgraph for each instance, where nodes represent the pixels in it and edges are added between nodes based on their spatial similarity.

Ranked #9 on Scene Text Recognition on ICDAR2015 (using extra training data)

Language Modelling Scene Text Recognition

Video Frame Interpolation without Temporal Priors

1 code implementation NeurIPS 2020 Youjian Zhang, Chaoyue Wang, DaCheng Tao

However, in complicated real-world situations, the temporal priors of videos, i.e., frames per second (FPS) and frame exposure time, may vary across different camera sensors.

Optical Flow Estimation Video Frame Interpolation

TAG: Toward Accurate Social Media Content Tagging with a Concept Graph

no code implementations 13 Oct 2021 Jiuding Yang, Weidong Guo, Bang Liu, Yakun Yu, Chaoyue Wang, Jinwen Luo, Linglong Kong, Di Niu, Zhen Wen

Although conceptualization has been widely studied in semantics and knowledge representation, it is still challenging to find the most accurate concept phrases to characterize the main idea of a text snippet on fast-growing social media.

Dependency Parsing Graph Matching +4

MRI-based Alzheimer's disease prediction via distilling the knowledge in multi-modal data

no code implementations 8 Apr 2021 Hao Guan, Chaoyue Wang, DaCheng Tao

In this work, we propose a multi-modal multi-instance distillation scheme, which aims to distill the knowledge learned from multi-modal data to an MRI-based network for MCI conversion prediction.

Disease Prediction

Exposure Trajectory Recovery from Motion Blur

1 code implementation 6 Oct 2020 Youjian Zhang, Chaoyue Wang, Stephen J. Maybank, DaCheng Tao

However, the motion information contained in a blurry image has yet to be fully explored and accurately formulated because: (i) the ground truth of dynamic motion is difficult to obtain; (ii) the temporal ordering is destroyed during the exposure; and (iii) the motion estimation from a blurry image is highly ill-posed.
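Ambiguity (ii), the destruction of temporal ordering, follows directly from the usual blur formation model, in which a blurry image is the average of latent sharp frames along the exposure trajectory: averaging is permutation-invariant, so any ordering of the frames yields the same blur. A toy sketch of that model (illustrative, not the authors' code):

```python
import numpy as np

def synthesize_blur(sharp_frames):
    """Blur formation model: average the latent sharp frames captured
    during the exposure. Any permutation of the frames gives the same
    result, so temporal order cannot be recovered from the blur alone."""
    return np.mean(np.stack(sharp_frames), axis=0)

rng = np.random.default_rng(0)
frames = [rng.random((4, 4)) for _ in range(5)]  # 5 latent sharp frames
blurry = synthesize_blur(frames)
```

Reversing or shuffling the frame list leaves the synthesized blur unchanged, which is exactly why exposure trajectory recovery is ill-posed.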

Deblurring Image Deblurring +1

A Systematic Survey of Regularization and Normalization in GANs

1 code implementation 19 Aug 2020 Ziqiang Li, Muhammad Usman, Rentuo Tao, Pengfei Xia, Chaoyue Wang, Huanhuan Chen, Bin Li

Although a number of regularization and normalization methods have been proposed for GANs, to the best of our knowledge, there exists no comprehensive survey that primarily focuses on the objectives and development of these methods, apart from a few limited-scope studies.

Data Augmentation

GIANT: Scalable Creation of a Web-scale Ontology

1 code implementation 5 Apr 2020 Bang Liu, Weidong Guo, Di Niu, Jinwen Luo, Chaoyue Wang, Zhen Wen, Yu Xu

These services will benefit from a highly structured and web-scale ontology of entities, concepts, events, topics and categories.

News Recommendation

A User-Centered Concept Mining System for Query and Document Understanding at Tencent

no code implementations 21 May 2019 Bang Liu, Weidong Guo, Di Niu, Chaoyue Wang, Shunnan Xu, Jinghong Lin, Kunfeng Lai, Yu Xu

We further present our techniques to tag documents with user-centered concepts and to construct a topic-concept-instance taxonomy, which has helped to improve search as well as news feeds recommendation in Tencent QQ Browser.

document understanding TAG

Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions

no code implementations 24 Dec 2018 Hao Xiong, Chaoyue Wang, DaCheng Tao, Michael Barnett, Chenyu Wang

However, existing methods inpaint lesions based on texture information derived from local surrounding tissue, often leading to inconsistent inpainting and the generation of artifacts such as intensity discrepancy and blurriness.

Evolutionary Generative Adversarial Networks

3 code implementations 1 Mar 2018 Chaoyue Wang, Chang Xu, Xin Yao, DaCheng Tao

In this paper, we propose a novel GAN framework called evolutionary generative adversarial networks (E-GAN) for stable GAN training and improved generative performance.

Perceptual Adversarial Networks for Image-to-Image Transformation

2 code implementations 28 Jun 2017 Chaoyue Wang, Chang Xu, Chaohui Wang, DaCheng Tao

The proposed PAN consists of two feed-forward convolutional neural networks (CNNs): the image transformation network T and the discriminative network D. By combining the generative adversarial loss and the proposed perceptual adversarial loss, these two networks can be trained alternately to solve image-to-image transformation tasks.

Image Inpainting

Tag Disentangled Generative Adversarial Networks for Object Image Re-rendering

no code implementations International Joint Conference on Artificial Intelligence 2017 Chaoyue Wang, Chaohui Wang, Chang Xu, DaCheng Tao

The whole framework consists of a disentangling network, a generative network, a tag mapping net, and a discriminative network, which are trained jointly on a given set of images that are completely or partially tagged (i.e., supervised/semi-supervised settings).

Object TAG
