Search Results for author: Yiliang Lv

Found 11 papers, 5 papers with code

Logic Diffusion for Knowledge Graph Reasoning

no code implementations • 6 Jun 2023 • Xiaoying Xie, Biao Gong, Yiliang Lv, Zhen Han, Guoshuai Zhao, Xueming Qian

Most recent works focus on answering first order logical queries to explore the knowledge graph reasoning via multi-hop logic predictions.

Paper
Add Code

Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

1 code implementation • 27 Mar 2023 • Siteng Huang, Biao Gong, Yutong Feng, Min Zhang, Yiliang Lv, Donglin Wang

Recent compositional zero-shot learning (CZSL) methods adapt pre-trained vision-language models (VLMs) by constructing trainable prompts only for composed state-object pairs.

Compositional Zero-Shot Learning Object

Paper
Code

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos

1 code implementation • ICCV 2023 • Yulin Pan, Xiangteng He, Biao Gong, Yiliang Lv, Yujun Shen, Yuxin Peng, Deli Zhao

Video temporal grounding aims to pinpoint a video segment that matches the query description.

Paper
Code

ViM: Vision Middleware for Unified Downstream Transferring

no code implementations • ICCV 2023 • Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.

Paper
Add Code

Rethinking Efficient Tuning Methods from a Unified Perspective

no code implementations • 1 Mar 2023 • Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Yiliang Lv, Deli Zhao, Jingren Zhou

The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which prove to achieve on-par or better performances on CIFAR-100 and FGVC datasets when compared with existing PETL methods.

Transfer Learning

Paper
Add Code

UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

no code implementations • 14 Feb 2023 • Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.

Common Sense Reasoning

Paper
Add Code

VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

1 code implementation • CVPR 2023 • Siteng Huang, Biao Gong, Yulin Pan, Jianwen Jiang, Yiliang Lv, Yuyuan Li, Donglin Wang

Many recent studies leverage the pre-trained CLIP for text-video cross-modal retrieval by tuning the backbone with additional heavy modules, which not only brings huge computational burdens with much more parameters, but also leads to the knowledge forgetting from upstream models.

Cross-Modal Retrieval Retrieval +1

Paper
Code

MAR: Masked Autoencoders for Efficient Action Recognition

1 code implementation • 24 Jul 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Xiang Wang, Yuehuan Wang, Yiliang Lv, Changxin Gao, Nong Sang

Inspired by this, we propose propose Masked Action Recognition (MAR), which reduces the redundant computation by discarding a proportion of patches and operating only on a part of the videos.

Ranked #12 on Action Recognition on Something-Something V2

Action Classification Action Recognition +1

Paper
Code

Video Similarity and Alignment Learning on Partial Video Copy Detection

no code implementations • 4 Aug 2021 • Zhen Han, Xiangteng He, Mingqian Tang, Yiliang Lv

To address the above issues, we propose the Video Similarity and Alignment Learning (VSAL) approach, which jointly models spatial similarity, temporal similarity and partial alignment.

Copy Detection Partial Video Copy Detection +1

Paper
Add Code

HANet: Hierarchical Alignment Networks for Video-Text Retrieval

1 code implementation • 26 Jul 2021 • Peng Wu, Xiangteng He, Mingqian Tang, Yiliang Lv, Jing Liu

Based on these, we naturally construct hierarchical representations in the individual-local-global manner, where the individual level focuses on the alignment between frame and word, local level focuses on the alignment between video clip and textual context, and global level focuses on the alignment between the whole video and text.

Retrieval Text Matching +3

Paper
Code

Self-supervised Video Retrieval Transformer Network

no code implementations • 16 Apr 2021 • Xiangteng He, Yulin Pan, Mingqian Tang, Yiliang Lv

In addition, most retrieval systems are based on frame-level features for video similarity searching, making it expensive both storage wise and search wise.

Retrieval Self-supervised Video Retrieval +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.