Search Results for author: Yujiu Yang

Found 111 papers, 66 papers with code

MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering

1 code implementation Findings (EMNLP) 2021 Junjie Wang, Yatai Ji, Jiaqi Sun, Yujiu Yang, Tetsuya Sakai

On the other hand, trilinear models such as the CTI model efficiently utilize the inter-modality information between answers, questions, and images, while ignoring intra-modality information.

Multiple-choice Question Answering +1

Sparse Adversarial Attack via Perturbation Factorization

1 code implementation ECCV 2020 Yanbo Fan, Baoyuan Wu, Tuanhui Li, Yong Zhang, Mingyang Li, Zhifeng Li, Yujiu Yang

Based on this factorization, we formulate the sparse attack problem as a mixed integer programming (MIP) to jointly optimize the binary selection factors and continuous perturbation magnitudes of all pixels, with a cardinality constraint on selection factors to explicitly control the degree of sparsity.

Adversarial Attack

Taming Lookup Tables for Efficient Image Retouching

1 code implementation 28 Mar 2024 Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang

Existing enhancement models often optimize for high performance while falling short of reducing hardware inference time and power consumption, especially on edge devices with constrained computing and storage resources.

A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

1 code implementation 16 Mar 2024 Tianhe Wu, Kede Ma, Jie Liang, Yujiu Yang, Lei Zhang

While Multimodal Large Language Models (MLLMs) have advanced significantly in visual understanding and reasoning, their potential to serve as powerful, flexible, interpretable, and text-driven models for Image Quality Assessment (IQA) remains largely unexplored.

Image Quality Assessment

Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training

no code implementations 1 Mar 2024 Qingyan Guo, Rui Wang, Junliang Guo, Xu Tan, Jiang Bian, Yujiu Yang

Accordingly, permutation on the training data is considered as a potential solution, since this can make the model predict antecedent words or tokens.

Language Modelling

CriticBench: Benchmarking LLMs for Critique-Correct Reasoning

1 code implementation 22 Feb 2024 Zicheng Lin, Zhibin Gou, Tian Liang, Ruilin Luo, Haowei Liu, Yujiu Yang

Utilizing CriticBench, we evaluate and dissect the performance of 17 LLMs in generation, critique, and correction reasoning, i.e., GQC reasoning.

Benchmarking

RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models

1 code implementation 20 Feb 2024 Xinchen Zhang, Ling Yang, Yaqi Cai, Zhaochen Yu, Jiake Xie, Ye Tian, Minkai Xu, Yong Tang, Yujiu Yang, Bin Cui

In this paper, we propose a new training-free and transferred-friendly text-to-image generation framework, namely RealCompo, which aims to leverage the advantages of text-to-image and layout-to-image models to enhance both realism and compositionality of the generated images.

Denoising Text-to-Image Generation

SciAgent: Tool-augmented Language Models for Scientific Reasoning

no code implementations 18 Feb 2024 Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen

To make this task more practical and solvable for LLMs, we introduce a new task setting named tool-augmented scientific reasoning.

LiFi: Lightweight Controlled Text Generation with Fine-Grained Control Codes

no code implementations 10 Feb 2024 Chufan Shi, Deng Cai, Yujiu Yang

In the rapidly evolving field of text generation, the demand for more precise control mechanisms has become increasingly apparent.

Attribute Language Modelling +1

A Thorough Examination of Decoding Methods in the Era of LLMs

no code implementations 10 Feb 2024 Chufan Shi, Haoran Yang, Deng Cai, Zhisong Zhang, Yifan Wang, Yujiu Yang, Wai Lam

Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers.

Quantization

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

2 code implementations 24 Jan 2024 Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.

Benchmarking

Deep Evolutional Instant Interest Network for CTR Prediction in Trigger-Induced Recommendation

no code implementations 15 Jan 2024 Zhibo Xiao, Luwei Yang, Tao Zhang, Wen Jiang, Wei Ning, Yujiu Yang

Recently, a new recommendation scenario, called Trigger-Induced Recommendation (TIR), where users are able to explicitly express their instant interests via trigger items, is emerging as an essential role in many e-commerce platforms, e.g., Alibaba.com and Amazon.

Click-Through Rate Prediction

Chain of History: Learning and Forecasting with LLMs for Temporal Knowledge Graph Completion

no code implementations 11 Jan 2024 Ruilin Luo, Tianle Gu, Haoling Li, Junzhe Li, Zicheng Lin, Jiayi Li, Yujiu Yang

Temporal Knowledge Graph Completion (TKGC) is a complex task involving the prediction of missing event links at future timestamps by leveraging established temporal structural knowledge.

Data Augmentation Link Prediction +1

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation

1 code implementation 1 Jan 2024 Zhuoyan Luo, Yicheng Xiao, Yong liu, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

The recent transformer-based models have dominated the Referring Video Object Segmentation (RVOS) task due to the superior performance.

Object Referring Video Object Segmentation +3

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

2 code implementations 1 Dec 2023 Gongye Liu, Menghan Xia, Yong Zhang, Haoxin Chen, Jinbo Xing, Xintao Wang, Yujiu Yang, Ying Shan

To address these challenges, we introduce StyleCrafter, a generic method that enhances pre-trained T2V models with a style control adapter, enabling video generation in any style by providing a reference image.

Disentanglement Text-to-Video Generation +1

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

1 code implementation 27 Nov 2023 Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi Zou, Youliang Yan, Yujiu Yang

We achieve this by marrying image appearance and language understanding to generate a cognitive embedding, which not only activates prior information from large text-to-image diffusion models but also facilitates the generation of high-quality reference images to optimize the SR process.

Super-Resolution

Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks

no code implementations 3 Nov 2023 Yifan Wang, Qingyan Guo, Xinzhe Ni, Chufan Shi, Lemao Liu, Haiyun Jiang, Yujiu Yang

In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs), enabling them to learn input-label mappings from demonstrations and perform well on downstream tasks.

In-Context Learning Open-Domain Question Answering

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models

1 code implementation 31 Oct 2023 Tian Liang, Zhiwei He, Jen-tse Huang, Wenxuan Wang, Wenxiang Jiao, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi, Xing Wang

Ideally, an advanced agent should possess the ability to accurately describe a given word using an aggressive description while concurrently maximizing confusion in the conservative description, enhancing its participation in the game.

Specialist or Generalist? Instruction Tuning for Specific NLP Tasks

no code implementations 23 Oct 2023 Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai

Although instruction tuning has proven to be a data-efficient method for transforming LLMs into such generalist models, their performance still lags behind specialist models trained exclusively for specific tasks.

Specificity

Continuous Invariance Learning

no code implementations 9 Oct 2023 Yong Lin, Fan Zhou, Lu Tan, Lintao Ma, Jiameng Liu, Yansu He, Yuan Yuan, Yu Liu, James Zhang, Yujiu Yang, Hao Wang

To address this challenge, we then propose Continuous Invariance Learning (CIL), which extracts invariant features across continuously indexed domains.

Cloud Computing

EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval

1 code implementation 2 Oct 2023 Yiyao Yu, Junjie Wang, Yuxiang Zhang, Lin Zhang, Yujiu Yang, Tetsuya Sakai

Artificial intelligence (AI) technologies should adhere to human norms to better serve our society and avoid disseminating harmful or misleading information, particularly in Conversational Information Retrieval (CIR).

Ethics Information Retrieval +1

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

1 code implementation 29 Sep 2023 Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen

Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.

Ranked #10 on Math Word Problem Solving on MATH (using extra training data)

Arithmetic Reasoning Computational Efficiency +3

Spurious Feature Diversification Improves Out-of-distribution Generalization

no code implementations 29 Sep 2023 Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang

Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.

Out-of-Distribution Generalization

Prior Bilinear Based Models for Knowledge Graph Completion

1 code implementation 25 Sep 2023 Jiayi Li, Ruilin Luo, Jiaqi Sun, Jing Xiao, Yujiu Yang

Bilinear based models are powerful and widely used approaches for Knowledge Graphs Completion (KGC).

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

1 code implementation 15 Sep 2023 Qingyan Guo, Rui Wang, Junliang Guo, Bei Li, Kaitao Song, Xu Tan, Guoqing Liu, Jiang Bian, Yujiu Yang

Large Language Models (LLMs) excel in various tasks, but they rely on carefully crafted prompts that often demand substantial human effort.

Evolutionary Algorithms

ToonTalker: Cross-Domain Face Reenactment

no code implementations ICCV 2023 Yuan Gong, Yong Zhang, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang, Baoyuan Wu, Yujiu Yang

Moreover, since no paired data is provided, we propose a novel cross-domain training scheme using data from two domains with the designed analogy constraint.

Face Reenactment Talking Face Generation

AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models

no code implementations 12 Aug 2023 Siheng Li, Cheng Yang, Yichun Yin, Xinyu Zhu, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

Information-seeking conversation, which aims to help users gather information through conversation, has achieved great progress in recent years.

Few-Shot Learning Language Modelling

NewsDialogues: Towards Proactive News Grounded Conversation

1 code implementation 12 Aug 2023 Siheng Li, Yichun Yin, Cheng Yang, Wangjie Jiang, Yiwei Li, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

In this paper, we propose a novel task, Proactive News Grounded Conversation, in which a dialogue system can proactively lead the conversation based on some key topics of the news.

Response Generation

Global and Local Semantic Completion Learning for Vision-Language Pre-training

1 code implementation 12 Jun 2023 Rong-Cheng Tu, Yatai Ji, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu

MGSC promotes learning more representative global features, which have a great impact on the performance of downstream tasks, while MLTC reconstructs modal-fusion local tokens, further enhancing accurate comprehension of multimodal data.

Language Modelling Masked Language Modeling +5

D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching

1 code implementation 10 Jun 2023 Xuanzhou Liu, Lin Zhang, Jiaqi Sun, Yujiu Yang, Haiqin Yang

Subgraph matching is a fundamental building block for graph-based applications and is challenging due to its high-order combinatorial nature.

Combinatorial Optimization

Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution

no code implementations 3 Jun 2023 Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang

These elaborated designs enable our model to generate portraits with robust multi-view semantic consistency, eliminating the need for optimization-based methods.

Text to 3D

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

1 code implementation 30 May 2023 Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi

To address the DoT problem, we propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.

Arithmetic Reasoning Machine Translation

TaleCrafter: Interactive Story Visualization with Multiple Characters

1 code implementation 29 May 2023 Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan, Yujiu Yang

Accurate Story visualization requires several necessary elements, such as identity consistency across frames, the alignment between plain text and visual content, and a reasonable layout of objects in images.

Story Visualization Text-to-Image Generation

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

1 code implementation NeurIPS 2023 Zhuoyan Luo, Yicheng Xiao, Yong liu, Shuyan Li, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

To address this issue, we propose Semantic-assisted Object Cluster (SOC), which aggregates video content and textual guidance for unified temporal modeling and cross-modal alignment.

Ranked #2 on Referring Expression Segmentation on A2D Sentences (using extra training data)

Object Referring Expression Segmentation +4

Accelerating Diffusion Models for Inverse Problems through Shortcut Sampling

1 code implementation 26 May 2023 Gongye Liu, Haoze Sun, Jiayi Li, Fei Yin, Yujiu Yang

Recently, diffusion models have demonstrated a remarkable ability to solve inverse problems in an unsupervised manner.

Colorization Deblurring +1

Question Answering as Programming for Solving Time-Sensitive Questions

1 code implementation 23 May 2023 Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, Yujiu Yang

Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world.

Natural Language Understanding Question Answering

MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction

1 code implementation 22 May 2023 Zhibin Gou, Qingyan Guo, Yujiu Yang

Generative methods greatly promote aspect-based sentiment analysis via generating a sequence of sentiment elements in a specified format.

Aspect-Based Sentiment Analysis Aspect Category Detection +10

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

2 code implementations 19 May 2023 Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking, or a code interpreter for debugging.

Fact Checking Natural Questions +4

Recouple Event Field via Probabilistic Bias for Event Extraction

no code implementations 19 May 2023 Xingyu Bai, Taiqiang Wu, Han Guo, Zhe Zhao, Xuefeng Yang, Jiayi Li, Weijie Liu, Qi Ju, Weigang Guo, Yujiu Yang

Event Extraction (EE), aiming to identify and classify event triggers and arguments from event mentions, has benefited from pre-trained language models (PLMs).

Event Extraction

Meta-Auxiliary Network for 3D GAN Inversion

no code implementations 18 May 2023 Bangrui Jiang, Zhenhua Guo, Yujiu Yang

In the first stage, we invert the input image to an editable latent code using off-the-shelf inversion techniques.

Image Manipulation Meta-Learning

Feature Expansion for Graph Neural Networks

1 code implementation 10 May 2023 Jiaqi Sun, Lin Zhang, Guangyi Chen, Kun Zhang, Peng Xu, Yujiu Yang

Graph neural networks aim to learn representations for graph-structured data and show impressive performance, particularly in node classification.

Node Classification Representation Learning

Exploring Human-Like Translation Strategy with Large Language Models

2 code implementations 6 May 2023 Zhiwei He, Tian Liang, Wenxiang Jiao, Zhuosheng Zhang, Yujiu Yang, Rui Wang, Zhaopeng Tu, Shuming Shi, Xing Wang

Compared to typical machine translation that focuses solely on source-to-target mapping, LLM-based translation can potentially mimic the human translation process which might take preparatory steps to ensure high-quality translation.

Hallucination Machine Translation +2

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

2 code implementations 12 Apr 2023 Jiahao Wang, Songyang Zhang, Yong liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin

Extensive experiments and ablative analysis also demonstrate that the inductive bias of network architecture can be incorporated into a simple network structure with an appropriate optimization strategy.

Inductive Bias

Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation from GNNs to MLPs

no code implementations 24 Mar 2023 Taiqiang Wu, Zhe Zhao, Jiahao Wang, Xingyu Bai, Lei Wang, Ngai Wong, Yujiu Yang

Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic.

Knowledge Distillation

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

no code implementations ICCV 2023 Kunyang Han, Yong liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

Masked Autoencoders Are Stronger Knowledge Distillers

no code implementations ICCV 2023 Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang

In MKD, random patches of the input image are masked, and the corresponding missing feature is recovered by forcing it to imitate the output of the teacher.

Knowledge Distillation object-detection +2

Generalizable Black-Box Adversarial Attack with Meta Learning

1 code implementation 1 Jan 2023 Fei Yin, Yong Zhang, Baoyuan Wu, Yan Feng, Jingyi Zhang, Yanbo Fan, Yujiu Yang

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget.

Adversarial Attack Meta-Learning

UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors

no code implementations ICCV 2023 Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang

Bridging this semantic gap now requires case-by-case algorithm design which is time-consuming and heavily relies on experienced adjustment.

Knowledge Distillation

RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer

no code implementations CVPR 2023 Jiahao Wang, Songyang Zhang, Yong liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin

Extensive experiments and ablative analysis also demonstrate that the inductive bias of network architecture can be incorporated into a simple network structure with an appropriate optimization strategy.

Inductive Bias

Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition

no code implementations 9 Dec 2022 Xinzhe Ni, Yong liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang

Then in the visual flow, visual prototypes are computed by a Temporal-Relational CrossTransformer (TRX) module for example.

Few-Shot action recognition Few Shot Action Recognition +1

GLeaD: Improving GANs with A Generator-Leading Task

1 code implementation CVPR 2023 Qingyan Bai, Ceyuan Yang, Yinghao Xu, Xihui Liu, Yujiu Yang, Yujun Shen

Generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D), where D is asked to differentiate whether an image comes from real data or is produced by G. Under such a formulation, D plays as the rule maker and hence tends to dominate the competition.

domain classification Generative Adversarial Network +1

3D GAN Inversion with Facial Symmetry Prior

no code implementations CVPR 2023 Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang

It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred as 3D GAN inversion.

Image Reconstruction Neural Rendering

Solving Math Word Problems via Cooperative Reasoning induced Language Models

1 code implementation 28 Oct 2022 Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang

This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier.

Arithmetic Reasoning Math

MCSCSet: A Specialist-annotated Dataset for Medical-domain Chinese Spelling Correction

1 code implementation 21 Oct 2022 Wangjie Jiang, Zhihao Ye, Zijing Ou, Ruihui Zhao, Jianguang Zheng, Yi Liu, Siheng Li, Bang Liu, Yujiu Yang, Yefeng Zheng

In this work, we define the task of Medical-domain Chinese Spelling Correction and propose MCSCSet, a large scale specialist-annotated dataset that contains about 200k samples.

Optical Character Recognition Optical Character Recognition (OCR) +1

Improving Your Graph Neural Networks: A High-Frequency Booster

1 code implementation 15 Oct 2022 Jiaqi Sun, Lin Zhang, Shenglin Zhao, Yujiu Yang

Graph neural networks (GNNs) hold the promise of learning efficient representations of graph-structured data, and one of its most important applications is semi-supervised node classification.

Node Classification Vocal Bursts Intensity Prediction

Global Spectral Filter Memory Network for Video Object Segmentation

1 code implementation 11 Oct 2022 Yong liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang

Besides, we empirically find low frequency feature should be enhanced in encoder (backbone) while high frequency for decoder (segmentation head).

Attribute Object +4

Towards Real-World Video Deblurring by Exploring Blur Formation Process

1 code implementation 28 Aug 2022 Mingdeng Cao, Zhihang Zhong, Yanbo Fan, Jiahao Wang, Yong Zhang, Jue Wang, Yujiu Yang, Yinqiang Zheng

We believe the novel realistic synthesis pipeline and the corresponding RAW video dataset can help the community to easily construct customized blur datasets to improve real-world video deblurring performance largely, instead of laboriously collecting real data pairs.

Deblurring

Modelling Latent Dynamics of StyleGAN using Neural ODEs

1 code implementation 23 Aug 2022 Weihao Xia, Yujiu Yang, Jing-Hao Xue

The entire sequence is seen as discrete-time observations of a continuous trajectory of the initial latent code, by considering each latent code as a moving particle and the latent space as a high-dimensional dynamic system.

Video Editing

Learning Quality-aware Dynamic Memory for Video Object Segmentation

1 code implementation 16 Jul 2022 Yong liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang

However, they mainly focus on better matching between the current frame and the memory frames without explicitly paying attention to the quality of the memory.

Ranked #11 on Semi-Supervised Video Object Segmentation on DAVIS 2016 (using extra training data)

Segmentation Semantic Segmentation +2

MORE: A Metric Learning Based Framework for Open-domain Relation Extraction

1 code implementation 1 Jun 2022 Yutong Wang, Renze Lou, Kai Zhang, MaoYan Chen, Yujiu Yang

To address these problems, in this work, we propose a novel learning framework named MORE (Metric learning-based Open Relation Extraction).

Clustering Metric Learning +2

Learning Adaptive Warping for Real-World Rolling Shutter Correction

1 code implementation CVPR 2022 Mingdeng Cao, Zhihang Zhong, Jiahao Wang, Yinqiang Zheng, Yujiu Yang

This paper proposes the first real-world rolling shutter (RS) correction dataset, BS-RSC, and a corresponding model to correct the RS frames in a distorted video.

Rolling Shutter Correction

EmpHi: Generating Empathetic Responses with Human-like Intents

1 code implementation NAACL 2022 Mao Yan Chen, Siheng Li, Yujiu Yang

To address the bias of the empathetic intents distribution between empathetic dialogue models and humans, we propose a novel model to generate empathetic responses with human-consistent empathetic intents, EmpHi for short.

VDTR: Video Deblurring with Transformer

1 code implementation 17 Apr 2022 Mingdeng Cao, Yanbo Fan, Yong Zhang, Jue Wang, Yujiu Yang

For multi-frame temporal modeling, we adapt Transformer to fuse multiple spatial features efficiently.

Deblurring Video Restoration

High-fidelity GAN Inversion with Padding Space

1 code implementation 21 Mar 2022 Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen

In this work, we propose to involve the padding space of the generator to complement the latent space with spatial information.

Generative Adversarial Network Image Manipulation +1

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

1 code implementation 8 Mar 2022 Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang

Our framework elevates the resolution of the synthesized talking face to 1024*1024 for the first time, even though the training dataset has a lower resolution.

Facial Editing Talking Face Generation +1

Context Enhanced Short Text Matching using Clickthrough Data

no code implementations 3 Mar 2022 Mao Yan Chen, Haiyun Jiang, Yujiu Yang

The short text matching task employs a model to determine whether two short texts have the same semantic meaning or intent.

Text Matching

STaR: Knowledge Graph Embedding by Scaling, Translation and Rotation

no code implementations 15 Feb 2022 Jiayi Li, Yujiu Yang

Therefore, we propose a corresponding bilinear model Scaling Translation and Rotation (STaR) consisting of the above two parts.

Knowledge Graph Embedding Link Prediction +1

Accelerating Neural Network Optimization Through an Automated Control Theory Lens

no code implementations CVPR 2022 Jiahao Wang, Baoyuan Wu, Rui Su, Mingdeng Cao, Shuwei Shi, Wanli Ouyang, Yujiu Yang

We conduct experiments both from a control theory lens through a phase locus verification and from a network training lens on several models, including CNNs, Transformers, MLPs, and on benchmark datasets.

Math

Adder Attention for Vision Transformer

4 code implementations NeurIPS 2021 Han Shu, Jiahao Wang, Hanting Chen, Lin Li, Yujiu Yang, Yunhe Wang

With the new operation, vision transformers constructed using additions can also provide powerful feature representations.

Identity-guided Face Generation with Multi-modal Contour Conditions

no code implementations 10 Oct 2021 Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang

Concretely, we propose a novel dual-encoder architecture, in which an identity encoder extracts the identity-related feature, accompanied by a main encoder to obtain the rough contour information and further fuse all the information together.

Face Generation Image Restoration

Guiding Topic Flows in the Generative Chatbot by Enhancing the ConceptNet with the Conversation Corpora

no code implementations 12 Sep 2021 Pengda Si, Yao Qiu, Jinchao Zhang, Yujiu Yang

Further analysis individually proves the effectiveness of the enhanced concept graph and the Edge-Transformer architecture.

Chatbot World Knowledge

Real-time Human-Centric Segmentation for Complex Video Scenes

1 code implementation 16 Aug 2021 Ran Yu, Chenyu Tian, Weihao Xia, Xinyuan Zhao, Haoqian Wang, Yujiu Yang

To alleviate this problem, we propose a mechanism named Inner Center Sampling to improve the accuracy of instance segmentation.

Instance Segmentation Segmentation +2

PoseDet: Fast Multi-Person Pose Estimation Using Pose Embedding

1 code implementation 22 Jul 2021 Chenyu Tian, Ran Yu, Xinyuan Zhao, Weihao Xia, Haoqian Wang, Yujiu Yang

This simple framework achieves an unprecedented speed and a competitive accuracy on the COCO benchmark compared with state-of-the-art methods.

Multi-Person Pose Estimation

Augmenting Anchors by the Detector Itself

1 code implementation 28 May 2021 Xiaopei Wan, Guoqiu Li, Yujiu Yang, Zhenhua Guo

Furthermore, AADI is a learning-based anchor augmentation method, but it does not add any parameters or hyper-parameters, which is beneficial for research and downstream tasks.

Object object-detection +1

Towards Open-World Text-Guided Face Image Generation and Manipulation

2 code implementations 18 Apr 2021 Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu

To be specific, we propose a brand new paradigm of text-guided image generation and manipulation based on the superior characteristics of a pretrained GAN model.

Language Modelling Semantic Segmentation +1

AACP: Model Compression by Accurate and Automatic Channel Pruning

no code implementations 31 Jan 2021 Lanbo Lin, Yujiu Yang, Zhenhua Guo

Firstly, AACP represents the structure of a model as a structure vector and introduces a pruning step vector to control the compressing granularity of each layer.

Model Compression Neural Architecture Search

Augmenting Proposals by the Detector Itself

no code implementations 28 Jan 2021 Xiaopei Wan, Zhenhua Guo, Chao He, Yujiu Yang, Fangbo Tao

Lacking enough high quality proposals for RoI box head has impeded two-stage and multi-stage object detectors for a long time, and many previous works try to solve it via improving RPN's performance or manually generating proposals from ground truth.

GAN Inversion: A Survey

1 code implementation 14 Jan 2021 Weihao Xia, Yulun Zhang, Yujiu Yang, Jing-Hao Xue, Bolei Zhou, Ming-Hsuan Yang

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator.

Image Manipulation Image Restoration

DT-QDC: A Dataset for Question Comprehension in Online Test

1 code implementation COLING 2020 Sijin Wu, Yujiu Yang, Nicholas Yung, Zhengchen Shen, Zeyang Lei

With the transformation of education from the traditional classroom environment to online education and assessment, it is more and more important to accurately assess the difficulty of questions than ever.

Controllable Continuous Gaze Redirection

1 code implementation 9 Oct 2020 Weihao Xia, Yujiu Yang, Jing-Hao Xue, Wensen Feng

The encoder maps images into a well-disentangled and hierarchically-organized latent space.

Attribute gaze redirection

Cognitive Representation Learning of Self-Media Online Article Quality

no code implementations 13 Aug 2020 Yiru Wang, Shen Huang, Gongfu Li, Qiang Deng, Dongliang Liao, Pengda Si, Yujiu Yang, Jin Xu

The automatic quality assessment of self-media online articles is an urgent and new issue, which is of great value to the online recommendation and search.

Representation Learning

HGCN4MeSH: Hybrid Graph Convolution Network for MeSH Indexing

no code implementations ACL 2020 Miaomiao Yu, Yujiu Yang, Chenhui Li

Recently deep learning has been used in Medical subject headings (MeSH) indexing to reduce the time and monetary cost by manual annotation, including DeepMeSH, TextCNN, etc.

Extreme Multi-Label Classification Representation Learning

Towards Multimodal Response Generation with Exemplar Augmentation and Curriculum Optimization

no code implementations 26 Apr 2020 Zeyang Lei, Zekang Li, Jinchao Zhang, Fandong Meng, Yang Feng, Yujiu Yang, Cheng Niu, Jie zhou

Furthermore, to facilitate the convergence of Gaussian mixture prior and posterior distributions, we devise a curriculum optimization strategy to progressively train the model under multiple training criteria from easy to hard.

Response Generation

HSCJN: A Holistic Semantic Constraint Joint Network for Diverse Response Generation

no code implementations 1 Dec 2019 Yiru Wang, Pengda Si, Zeyang Lei, Guangxu Xun, Yujiu Yang

The sequence-to-sequence (Seq2Seq) model generates target words iteratively given the previously observed words during decoding process, which results in the loss of the holistic semantics in the target response and the complete semantic relationship between responses and dialogue histories.

Response Generation Sentence

Self-supervised Feature Learning for 3D Medical Images by Playing a Rubik's Cube

no code implementations 5 Oct 2019 Xinrui Zhuang, Yuexiang Li, Yifan Hu, Kai Ma, Yujiu Yang, Yefeng Zheng

With the development of deep learning, an increasing number of studies try to build computer-aided diagnosis systems for 3D volumetric medical data.

Brain Tumor Segmentation Rubik's Cube +2

Multi-glance Reading Model for Text Understanding

no code implementations WS 2018 Pengcheng Zhu, Yujiu Yang, Wenqiang Gao, Yi Liu

Based on the multi-glance mechanism, we design two types of recurrent neural network models for repeated reading: Glance Cell Model (GCM) and Glance Gate Model (GGM).

Document Classification Machine Translation +2

Faster Spatially Regularized Correlation Filters for Visual Tracking

no code implementations 1 Jun 2017 Xiaoxiang Hu, Yujiu Yang

Our approach achieves equivalent performance to the baseline tracker SRDCF on all three datasets.

Visual Tracking
