Search Results for author: Jianwen Jiang

Found 18 papers, 8 papers with code

Superior and Pragmatic Talking Face Generation with Teacher-Student Framework

no code implementations 26 Mar 2024 Chao Liang, Jianwen Jiang, Tianyun Zhong, Gaojie Lin, Zhengkun Rong, Jiaqi Yang, Yongming Zhu

Talking face generation technology creates talking videos from arbitrary appearance and motion signals; the "arbitrary" offers ease of use but also introduces challenges in practical applications.

Talking Face Generation

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

3 code implementations ICCV 2023 Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao

In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.

Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)

Graph Generation Human-Object Interaction Detection +6

ViM: Vision Middleware for Unified Downstream Transferring

no code implementations ICCV 2023 Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.

VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

1 code implementation CVPR 2023 Siteng Huang, Biao Gong, Yulin Pan, Jianwen Jiang, Yiliang Lv, Yuyuan Li, Donglin Wang

Many recent studies leverage the pre-trained CLIP for text-video cross-modal retrieval by tuning the backbone with additional heavy modules, which not only imposes a huge computational burden with many more parameters, but also leads to knowledge forgetting from the upstream models.

Cross-Modal Retrieval Retrieval +1

Grow and Merge: A Unified Framework for Continuous Categories Discovery

no code implementations 9 Oct 2022 Xinwei Zhang, Jianwen Jiang, Yutong Feng, Zhi-Fan Wu, Xibin Zhao, Hai Wan, Mingqian Tang, Rong Jin, Yue Gao

Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories.

Self-Supervised Learning

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

3 code implementations 5 Sep 2022 Hangjie Yuan, Jianwen Jiang, Samuel Albanie, Tao Feng, Ziyuan Huang, Dong Ni, Mingqian Tang

The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.

Human-Object Interaction Detection Relation +1

Rethinking Supervised Pre-training for Better Downstream Transferring

no code implementations ICLR 2022 Yutong Feng, Jianwen Jiang, Mingqian Tang, Rong Jin, Yue Gao

Though in most cases the pre-training stage is conducted with supervised methods, recent works on self-supervised pre-training have shown powerful transferability and even outperform supervised pre-training on multiple downstream tasks.

Open-Ended Question Answering

NGC: A Unified Framework for Learning with Open-World Noisy Data

no code implementations ICCV 2021 Zhi-Fan Wu, Tong Wei, Jianwen Jiang, Chaojie Mao, Mingqian Tang, Yu-Feng Li

The existence of noisy data is prevalent in both the training and testing phases of machine learning systems, which inevitably leads to the degradation of model performance.

Image Classification

Weakly-Supervised Temporal Action Localization Through Local-Global Background Modeling

no code implementations 20 Jun 2021 Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Yuanjie Shao, Nong Sang

Then our proposed Local-Global Background Modeling Network (LGBM-Net) is trained to localize instances by using only video-level labels based on Multi-Instance Learning (MIL).

Weakly-supervised Learning Weakly-supervised Temporal Action Localization +1
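The Multi-Instance Learning setup mentioned above supervises only video-level predictions, which must be aggregated from per-clip scores. As a minimal sketch (not the LGBM-Net code; the top-k mean pooling choice and names are illustrative assumptions), the aggregation can look like this:

```python
import numpy as np

def video_level_scores(clip_scores: np.ndarray, k: int = 3) -> np.ndarray:
    """Aggregate per-clip class scores (T x C) into video-level scores (C,)
    by averaging the k highest-scoring clips per class, a common MIL pooling."""
    top_k = np.sort(clip_scores, axis=0)[-k:]  # k largest scores per class
    return top_k.mean(axis=0)

# Toy example: 5 clips, 2 classes. Only the video-level label is available,
# so the training loss is applied to these aggregated scores, not to clips.
scores = np.array([[0.1, 0.9],
                   [0.2, 0.8],
                   [0.9, 0.1],
                   [0.8, 0.2],
                   [0.7, 0.3]])
video_scores = video_level_scores(scores, k=2)
```

Top-k pooling (rather than max pooling) makes the video-level score depend on several clips, which tends to spread the localization signal over whole action instances.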

Relation Modeling in Spatio-Temporal Action Localization

no code implementations15 Jun 2021 Yutong Feng, Jianwen Jiang, Ziyuan Huang, Zhiwu Qing, Xiang Wang, Shiwei Zhang, Mingqian Tang, Yue Gao

This paper presents our solution to the AVA-Kinetics Crossover Challenge of ActivityNet workshop at CVPR 2021.

Ranked #4 on Spatio-Temporal Action Localization on AVA-Kinetics (using extra training data)

Action Detection Relation +2

A Stronger Baseline for Ego-Centric Action Detection

1 code implementation 13 Jun 2021 Zhiwu Qing, Ziyuan Huang, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Nong Sang

This technical report analyzes an egocentric video action detection method we used in the 2021 EPIC-KITCHENS-100 competition hosted at the CVPR 2021 Workshop.

Action Detection

Self-supervised Motion Learning from Static Images

1 code implementation CVPR 2021 Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo Ang

We furthermore introduce a static mask in pseudo motions to create local motion patterns, which forces the model to additionally locate notable motion areas for the correct classification. We demonstrate that MoSI can discover regions with large motion even without fine-tuning on the downstream datasets.

Action Recognition Self-Supervised Learning
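The pseudo motions described above are clips synthesized from a single static image. A hedged sketch of the idea (not the MoSI implementation; function name, sliding-crop scheme, and parameters are assumptions): slide a crop window across the image so the resulting clip contains a known motion direction, which serves as a free self-supervised label.

```python
import numpy as np

def pseudo_motion(image: np.ndarray, num_frames: int = 8,
                  crop: int = 64, direction=(1, 0)) -> np.ndarray:
    """Build a pseudo-motion clip from one static image by sliding a
    crop window along `direction` (dy, dx in {0, 1}); a model can then
    be trained to classify the motion direction from the clip."""
    h, w = image.shape[:2]
    dy, dx = direction
    step_y = (h - crop) // max(num_frames - 1, 1)
    step_x = (w - crop) // max(num_frames - 1, 1)
    frames = []
    for t in range(num_frames):
        y = t * step_y * dy
        x = t * step_x * dx
        frames.append(image[y:y + crop, x:x + crop])
    return np.stack(frames)  # (num_frames, crop, crop, C)

# Example: a downward-sliding crop over a synthetic 128x128 RGB image.
image = np.arange(128 * 128 * 3).reshape(128, 128, 3)
clip = pseudo_motion(image, num_frames=8, crop=64, direction=(1, 0))
```

A static mask, as the snippet notes, would zero out part of each frame so that only a local region actually moves, forcing the model to locate the motion.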

Incremental Learning on Growing Graphs

no code implementations 1 Jan 2021 Yutong Feng, Jianwen Jiang, Yue Gao

To tackle this problem, we introduce incremental graph learning (IGL), a general framework that formulates learning on growing graphs in an incremental manner, where a traditional graph learning method can be deployed as the base model.

Graph Learning Incremental Learning +2

Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw

no code implementations 1 Jan 2021 Yuqi Huo, Mingyu Ding, Haoyu Lu, Zhiwu Lu, Tao Xiang, Ji-Rong Wen, Ziyuan Huang, Jianwen Jiang, Shiwei Zhang, Mingqian Tang, Songfang Huang, Ping Luo

Since directly solving the constrained jigsaw puzzles could still be extremely hard, we instead carefully design four surrogate tasks that are more tractable while still ensuring that the learned representation is sensitive to spatiotemporal continuity at both the local and global levels.

Representation Learning

DHGNN: Dynamic Hypergraph Neural Networks

1 code implementation 1 Jul 2019 Jianwen Jiang, Yuxuan Wei, Yifan Feng, Jingxuan Cao, Yue Gao

Then hypergraph convolution is introduced to encode high-order data relations in a hypergraph structure.
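Hypergraph convolution, as mentioned above, propagates node features through hyperedges that can connect more than two nodes. A minimal sketch of the standard formulation X' = Dv⁻¹ H W De⁻¹ Hᵀ X Θ (with edge weights W = I; not the DHGNN code, and it assumes every node and hyperedge has nonzero degree):

```python
import numpy as np

def hypergraph_conv(X: np.ndarray, H: np.ndarray, Theta: np.ndarray) -> np.ndarray:
    """One hypergraph convolution layer with unit edge weights.
    X: (N, F) node features; H: (N, E) incidence matrix; Theta: (F, F_out).
    Nodes aggregate over their hyperedges, hyperedges over their nodes."""
    Dv = np.diag(1.0 / H.sum(axis=1))  # inverse node degrees
    De = np.diag(1.0 / H.sum(axis=0))  # inverse hyperedge degrees
    return Dv @ H @ De @ H.T @ X @ Theta

# Example: 3 nodes, 2 hyperedges ({0,1} and {1,2}), identity projection.
X = np.ones((3, 2))
H = np.array([[1., 0.],
              [1., 1.],
              [0., 1.]])
out = hypergraph_conv(X, H, np.eye(2))
```

With this degree normalization the operator is row-stochastic, so constant features stay constant, a quick sanity check for an implementation.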
