Search Results for author: Linjie Yang

Found 38 papers, 27 papers with code

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

no code implementations5 Mar 2024 Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang

Our MLM filter can generalize to different models and tasks, and be used as a drop-in replacement for CLIPScore.
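
The drop-in-replacement claim boils down to threshold filtering over per-pair quality scores. A minimal sketch, where `quality_score` is a hypothetical stand-in for either the MLM filter or CLIPScore (not the paper's actual model):

```python
def filter_pairs(pairs, quality_score, threshold):
    """Keep image-text pairs whose quality score clears the threshold."""
    return [p for p in pairs if quality_score(p) >= threshold]

# toy scorer for illustration only: fraction of caption words longer
# than 3 characters (a real scorer would be a learned model)
def toy_score(pair):
    words = pair["caption"].split()
    return sum(len(w) > 3 for w in words) / max(len(words), 1)

pairs = [
    {"image": "a.jpg", "caption": "a photo of a golden retriever"},
    {"image": "b.jpg", "caption": "img 001"},
]
kept = filter_pairs(pairs, toy_score, threshold=0.5)
```

Any scorer with the same pair-to-scalar signature can be swapped in, which is what makes the filter a drop-in replacement.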

Video Recognition in Portrait Mode

1 code implementation21 Dec 2023 Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Data Augmentation Video Recognition

Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling

no code implementations8 Oct 2023 Haogeng Liu, Qihang Fan, Tingkai Liu, Linjie Yang, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

This paper proposes Video-Teller, a video-language foundation model that leverages multi-modal fusion and fine-grained modality alignment to significantly enhance the video-to-text generation task.

Text Generation Video Summarization

Selective Feature Adapter for Dense Vision Transformers

no code implementations3 Oct 2023 Xueqing Deng, Qi Fan, Xiaojie Jin, Linjie Yang, Peng Wang

Specifically, SFA consists of external adapters and internal adapters which are sequentially operated over a transformer model.

Depth Estimation

The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering

no code implementations27 Sep 2023 Haichao Yu, Yu Tian, Sateesh Kumar, Linjie Yang, Heng Wang

DataComp is a new benchmark dedicated to evaluating different methods for data filtering.

Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation

1 code implementation23 Jul 2023 Yiming Cui, Linjie Yang, Haichao Yu

Transformer-based detection and segmentation methods use a list of learned detection queries to retrieve information from the transformer network and learn to predict the location and category of one specific object from each query.
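
One way to make such learned queries dynamic is to predict per-image mixing weights and form each output query as a convex combination of the learned base queries. The sketch below illustrates that idea only; `mix_proj` and the softmax mixing are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
num_queries, dim = 8, 16
base_queries = rng.normal(size=(num_queries, dim))  # learned detection queries

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_queries(image_feat, base, mix_proj):
    # predict an (num_queries x num_queries) mixing matrix from the image
    # feature, then combine the static base queries per image
    logits = image_feat @ mix_proj
    weights = softmax(logits.reshape(len(base), len(base)), axis=-1)
    return weights @ base  # image-conditioned queries

image_feat = rng.normal(size=(dim,))
mix_proj = rng.normal(size=(dim, num_queries * num_queries))
queries = dynamic_queries(image_feat, base_queries, mix_proj)
```

The base queries stay shared across images; only the cheap mixing weights change per input.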

Instance Segmentation Object +5

Exploring the Role of Audio in Video Captioning

no code implementations21 Jun 2023 YuHan Shen, Linjie Yang, Longyin Wen, Haichao Yu, Ehsan Elhamifar, Heng Wang

Recent focus in video captioning has been on designing architectures that can consume both video and text modalities, and using large-scale video datasets with text transcripts for pre-training, such as HowTo100M.

Automatic Speech Recognition (ASR) +2

R²Former: Unified Retrieval and Reranking Transformer for Place Recognition

no code implementations6 Apr 2023 Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang

Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.
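
The retrieval stage of such a pipeline is typically nearest-neighbour search over global descriptors. A generic sketch of that step (cosine similarity over made-up descriptors, not R²Former's actual model):

```python
import numpy as np

def retrieve(query_desc, db_descs, top_k=3):
    # cosine-similarity retrieval: rank reference images by similarity
    # of their global descriptors to the query descriptor
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = db @ q
    order = np.argsort(-sims)[:top_k]
    return order, sims[order]

rng = np.random.default_rng(1)
db = rng.normal(size=(100, 32))          # 100 reference descriptors
query = db[42] + 0.05 * rng.normal(size=32)  # query near reference #42
top, scores = retrieve(query, db)
```

The reranking stage would then refine this shortlist with finer-grained (e.g. local-feature) comparisons.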

Feature Correlation Retrieval +1

FAQ: Feature Aggregated Queries for Transformer-based Video Object Detectors

1 code implementation15 Mar 2023 Yiming Cui, Linjie Yang

With Transformer-based object detectors achieving better performance on image-domain tasks, recent works began to extend those methods to video object detection.

Object Detection +1

Revisiting Training-free NAS Metrics: An Efficient Training-based Method

1 code implementation16 Nov 2022 Taojiannan Yang, Linjie Yang, Xiaojie Jin, Chen Chen

In this paper, we revisit these training-free metrics and find that: (1) the number of parameters (\#Param), which is the most straightforward training-free metric, is overlooked in previous works but is surprisingly effective, (2) recent training-free metrics largely rely on the \#Param information to rank networks.
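
Using #Param as a training-free metric amounts to ranking candidate architectures by parameter count alone. A toy sketch under the assumption of plain fully-connected chains (the layer widths below are made up for illustration):

```python
def count_params(widths):
    # a layer of width w_out fed by width w_in contributes
    # w_in * w_out weights plus w_out biases
    return sum(w_in * w_out + w_out for w_in, w_out in zip(widths, widths[1:]))

# rank candidates by the #Param metric alone -- the simple training-free
# signal the paper finds surprisingly effective
candidates = {
    "small":  [32, 64, 10],
    "medium": [64, 128, 10],
    "large":  [128, 256, 10],
}
ranking = sorted(candidates, key=lambda n: count_params(candidates[n]),
                 reverse=True)
```

The paper's observation is that more elaborate training-free metrics often add little beyond this ranking.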

Neural Architecture Search

Dynamic Proposals for Efficient Object Detection

no code implementations12 Jul 2022 Yiming Cui, Linjie Yang, Ding Liu

Object detection is a fundamental computer vision task that localizes and categorizes objects in a given image.

Object Detection +1

Robust High-Resolution Video Matting with Temporal Guidance

1 code implementation25 Aug 2021 Shanchuan Lin, Linjie Yang, Imran Saleemi, Soumyadip Sengupta

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.

4k Image Matting +2

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

1 code implementation CVPR 2021 Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

Last, we propose an efficient fine-grained search strategy to train HR-NAS, which effectively explores the search space and finds optimal architectures given various tasks and computation resources.

Image Classification Neural Architecture Search +3

Is In-Domain Data Really Needed? A Pilot Study on Cross-Domain Calibration for Network Quantization

no code implementations16 May 2021 Haichao Yu, Linjie Yang, Humphrey Shi

Post-training quantization methods use a set of calibration data to compute quantization ranges for network parameters and activations.
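
The calibration step described here can be sketched as follows, assuming a symmetric uniform quantizer (a generic post-training-quantization illustration, not the paper's code; the paper's question is whether the calibration data must come from the training domain):

```python
import numpy as np

def calibrate(activations, bits=8):
    # derive a symmetric quantization range from calibration activations
    max_abs = np.abs(activations).max()
    return max_abs / (2 ** (bits - 1) - 1)  # scale per quantization level

def quantize(x, scale, bits=8):
    # round to the nearest level, clip to the representable range,
    # then map back to real values (dequantize)
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
calib = rng.normal(size=10_000)   # stand-in calibration activations
scale = calibrate(calib)
x = rng.normal(size=1_000)
xq = quantize(x, scale)
```

Cross-domain calibration asks how much accuracy changes when `calib` is drawn from a different distribution than `x`.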

Quantization

Progressive Temporal Feature Alignment Network for Video Inpainting

1 code implementation CVPR 2021 Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee

To achieve this goal, it is necessary to find correspondences from neighbouring frames to faithfully hallucinate the unknown content.

Optical Flow Estimation Video Inpainting

Learning Versatile Neural Architectures by Propagating Network Codes

1 code implementation ICLR 2022 Mingyu Ding, Yuqi Huo, Haoyu Lu, Linjie Yang, Zhe Wang, Zhiwu Lu, Jingdong Wang, Ping Luo

(4) Thorough studies of NCP on inter-, cross-, and intra-tasks highlight the importance of cross-task neural architecture design, i.e., multitask neural architectures and architecture transferring between different tasks.

Image Segmentation Neural Architecture Search +2

DeepViT: Towards Deeper Vision Transformer

5 code implementations22 Mar 2021 Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

In this paper, we show that, unlike convolutional neural networks (CNNs), which can be improved by stacking more convolutional layers, the performance of ViTs saturates quickly when they are scaled deeper.

Image Classification Representation Learning

AutoSpace: Neural Architecture Search with Less Human Interference

1 code implementation ICCV 2021 Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng

Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction.

Neural Architecture Search

FracBits: Mixed Precision Quantization via Fractional Bit-Widths

1 code implementation4 Jul 2020 Linjie Yang, Qing Jin

Model quantization helps to reduce model size and latency of deep neural networks.
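
A fractional bit-width can be realized by linearly interpolating the quantizers at the two neighbouring integer bit-widths, which is the core of the FracBits idea. The sketch below assumes a symmetric uniform quantizer and is an illustration, not the paper's implementation:

```python
import numpy as np

def uniform_quant(x, bits, max_abs=1.0):
    # symmetric uniform quantizer at an integer bit-width
    qmax = 2 ** (bits - 1) - 1
    scale = max_abs / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def frac_quant(x, bits):
    # fractional bit-width via linear interpolation between the two
    # neighbouring integer bit-widths, making bit-width differentiable
    lo, hi = int(np.floor(bits)), int(np.ceil(bits))
    if lo == hi:
        return uniform_quant(x, lo)
    frac = bits - lo
    return (1 - frac) * uniform_quant(x, lo) + frac * uniform_quant(x, hi)

x = np.linspace(-1, 1, 11)
y = frac_quant(x, 4.5)  # halfway between 4-bit and 5-bit quantization
```

Because the output varies smoothly with `bits`, the bit-width itself can be optimized by gradient descent and rounded at the end of training.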

Quantization

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations CVPR 2020 Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, embedding NL blocks in mobile neural networks has rarely been explored, mainly due to the following challenges: 1) NL blocks generally have a heavy computation cost, which makes them difficult to apply where computational resources are limited, and 2) it is an open problem to discover an optimal configuration for embedding NL blocks into mobile neural networks.

Image Classification Neural Architecture Search

Towards Efficient Training for Neural Network Quantization

3 code implementations21 Dec 2019 Qing Jin, Linjie Yang, Zhenyu Liao

To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training.

Quantization

AdaBits: Neural Network Quantization with Adaptive Bit-Widths

1 code implementation CVPR 2020 Qing Jin, Linjie Yang, Zhenyu Liao

With our proposed techniques applied on a bunch of models including MobileNet-V1/V2 and ResNet-50, we demonstrate that bit-width of weights and activations is a new option for adaptively executable deep neural networks, offering a distinct opportunity for improved accuracy-efficiency trade-off as well as instant adaptation according to the platform constraints in real-world applications.

Quantization

AtomNAS: Fine-Grained End-to-End Neural Architecture Search

1 code implementation ICLR 2020 Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang

We propose a fine-grained search space comprised of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms.

Neural Architecture Search

Rethinking Neural Network Quantization

no code implementations25 Sep 2019 Qing Jin, Linjie Yang, Zhenyu Liao

To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training.

Quantization

Weakly Supervised Body Part Segmentation with Pose based Part Priors

no code implementations30 Jul 2019 Zhengyuan Yang, Yuncheng Li, Linjie Yang, Ning Zhang, Jiebo Luo

The core idea is to first convert sparse weak labels such as keypoints into an initial estimate of body part masks, and then iteratively refine the part mask predictions.

Face Parsing Segmentation +1

Context-Aware Zero-Shot Recognition

1 code implementation19 Apr 2019 Ruotian Luo, Ning Zhang, Bohyung Han, Linjie Yang

We present a novel problem setting in zero-shot learning: zero-shot object recognition and detection in context.

Object Recognition Zero-Shot Learning

Streamlined Dense Video Captioning

1 code implementation CVPR 2019 Jonghwan Mun, Linjie Yang, Zhou Ren, Ning Xu, Bohyung Han

Dense video captioning is an extremely challenging task since accurate and coherent description of events in a video requires holistic understanding of video contents as well as contextual reasoning of individual events.

Dense Video Captioning

Slimmable Neural Networks

3 code implementations ICLR 2019 Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, Thomas Huang

Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization.
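
Switchable batch normalization keeps independent normalization statistics for each width multiplier, since a slimmed sub-network sees different activation statistics than the full one. A minimal sketch of that mechanism (an illustration under assumed widths and momentum, not the authors' implementation):

```python
import numpy as np

class SwitchableBatchNorm:
    """Independent BN statistics per switchable width (sketch)."""

    def __init__(self, num_features, widths=(0.25, 0.5, 1.0)):
        # one (mean, var) pair per width; a 0.5x sub-network only uses
        # the first half of the channels
        self.stats = {w: {"mean": np.zeros(int(num_features * w)),
                          "var": np.ones(int(num_features * w))}
                      for w in widths}
        self.active = max(widths)

    def switch(self, width):
        # select which sub-network's statistics to use
        self.active = width

    def __call__(self, batch):
        s = self.stats[self.active]
        return (batch - s["mean"]) / np.sqrt(s["var"] + 1e-5)

bn = SwitchableBatchNorm(8)
bn.switch(0.5)  # run the 0.5x-width sub-network: 4 active channels
x = np.random.default_rng(0).normal(size=(4, 4))
out = bn(x)
```

All other layers share weights across widths; only the lightweight BN statistics are duplicated.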

Instance Segmentation Keypoint Detection +3

YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark

no code implementations6 Sep 2018 Ning Xu, Linjie Yang, Yuchen Fan, Dingcheng Yue, Yuchen Liang, Jianchao Yang, Thomas Huang

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Image Segmentation Object +6

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

4 code implementations ECCV 2018 Ning Xu, Linjie Yang, Yuchen Fan, Jianchao Yang, Dingcheng Yue, Yuchen Liang, Brian Price, Scott Cohen, Thomas Huang

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Ranked #12 on Video Object Segmentation on YouTube-VOS 2018 (F-Measure (Unseen) metric)

Image Segmentation Object +7

Efficient Video Object Segmentation via Network Modulation

1 code implementation CVPR 2018 Linjie Yang, Yanran Wang, Xuehan Xiong, Jianchao Yang, Aggelos K. Katsaggelos

Video object segmentation aims at segmenting a specific object throughout a video sequence, given only an annotated first frame.

Object Segmentation +5

Dense Captioning with Joint Inference and Visual Context

1 code implementation CVPR 2017 Linjie Yang, Kevin Tang, Jianchao Yang, Li-Jia Li

The goal is to densely detect visual concepts (e.g., objects, object parts, and interactions between them) from images, labeling each with a short descriptive phrase.

Dense Captioning Descriptive
