Search Results for author: Zhangyang Wang

Found 357 papers, 221 papers with code

HALO: Hardware-Aware Learning to Optimize

1 code implementation • ECCV 2020 • Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin

There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.

Paper
Code

Eliminating the Invariance on the Loss Landscape of Linear Autoencoders

no code implementations • ICML 2020 • Reza Oftadeh, Jiayi Shen, Zhangyang Wang, Dylan Shell

For this new loss, we characterize the full structure of the loss landscape in the following sense: we establish analytical expression for the set of all critical points, show that it is a subset of critical points of MSE, and that all local minima are still global.

Paper
Add Code

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models

no code implementations • 11 Apr 2024 • Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, Xingqian Xu, Zhangyang Wang, Humphrey Shi, Nicu Sebe

In this paper, we tackle the challenge of open-set bias detection in text-to-image generative models presenting OpenBias, a new pipeline that identifies and quantifies the severity of biases agnostically, without access to any precompiled set.

Bias Detection Fairness +3

Paper
Add Code

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

no code implementations • 10 Apr 2024 • Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi

This point cloud serves as the initial state for the centroids of 3D Gaussians.

Scene Generation Text to 3D

Paper
Add Code

MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements

no code implementations • 1 Apr 2024 • Lisong C. Sun, Neel P. Bhatt, Jonathan C. Liu, Zhiwen Fan, Zhangyang Wang, Todd E. Humphreys, Ufuk Topcu

We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.

Scene Understanding Simultaneous Localization and Mapping

Paper
Add Code

InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds

no code implementations • 29 Mar 2024 • Zhiwen Fan, Wenyan Cong, Kairun Wen, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang

This pre-processing is usually conducted via a Structure-from-Motion (SfM) pipeline, a procedure that can be slow and unreliable, particularly in sparse-view scenarios with insufficient matched features for accurate reconstruction.

Novel View Synthesis SSIM

Paper
Add Code

Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D

no code implementations • 27 Mar 2024 • Mukund Varma T, Peihao Wang, Zhiwen Fan, Zhangyang Wang, Hao Su, Ravi Ramamoorthi

In recent years, there has been an explosion of 2D vision models for numerous tasks such as semantic segmentation, style transfer or scene editing, enabled by large-scale 2D image datasets.

Colorization Image Colorization +3

Paper
Add Code

Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study

no code implementations • 26 Mar 2024 • Jinze Zhao, Peihao Wang, Zhangyang Wang

Specifically, we investigate the impact of the number of data samples, the total number of experts, the sparsity in expert selection, the complexity of the routing mechanism, and the complexity of individual experts.

Learning Theory

Paper
Add Code

Comp4D: LLM-Guided Compositional 4D Scene Generation

no code implementations • 25 Mar 2024 • Dejia Xu, Hanwen Liang, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Plataniotis, Zhangyang Wang

Recent advancements in diffusion models for 2D and 3D content creation have sparked a surge of interest in generating 4D content.

Object Scene Generation +1

Paper
Add Code

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

1 code implementation • 21 Mar 2024 • Roberto Henschel, Levon Khachatryan, Daniil Hayrapetyan, Hayk Poghosyan, Vahram Tadevosyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi

To overcome these limitations, we introduce StreamingT2V, an autoregressive approach for long video generation of 80, 240, 600, 1200 or more frames with smooth transitions.

Text-to-Video Generation Video Generation

768

Paper
Code

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

no code implementations • 18 Mar 2024 • Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected.

Ethics Fairness +1

Paper
Add Code

Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

1 code implementation • 14 Mar 2024 • Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang

While diffusion models have recently demonstrated remarkable progress in generating realistic images, privacy risks also arise: published models or APIs could generate training images and thus leak privacy-sensitive training information.

Inference Attack Membership Inference Attack

Paper
Code

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

1 code implementation • 6 Mar 2024 • Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

Our approach reduces memory usage by up to 65. 5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with C4 dataset with up to 19. 7B tokens, and on fine-tuning RoBERTa on GLUE tasks.

1,085

Paper
Code

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

1 code implementation • 5 Mar 2024 • Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang

To address this problem, this paper introduces Multi-scale Positional Encoding (Ms-PoE) which is a simple yet effective plug-and-play approach to enhance the capacity of LLMs to handle the relevant information located in the middle of the context, without fine-tuning or introducing any additional overhead.

Language Modelling

Paper
Code

Principled Architecture-aware Scaling of Hyperparameters

1 code implementation • 27 Feb 2024 • Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin

However, most designs or optimization methods are agnostic to the choice of network structures, and thus largely ignore the impact of neural architectures on hyperparameters.

AutoML

Paper
Code

Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization

1 code implementation • 22 Feb 2024 • Xuxi Chen, Zhendong Wang, Daouda Sow, Junjie Yang, Tianlong Chen, Yingbin Liang, Mingyuan Zhou, Zhangyang Wang

Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets, with a specific focus on selective retention of samples that incur moderately high losses.

Paper
Code

Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark

1 code implementation • 18 Feb 2024 • Yihua Zhang, Pingzhi Li, Junyuan Hong, Jiaxiang Li, Yimeng Zhang, Wenqing Zheng, Pin-Yu Chen, Jason D. Lee, Wotao Yin, Mingyi Hong, Zhangyang Wang, Sijia Liu, Tianlong Chen

In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has become standard.

Benchmarking

Paper
Code

Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community

1 code implementation • 15 Feb 2024 • Arman Isajanyan, Artur Shatveryan, David Kocharyan, Zhangyang Wang, Humphrey Shi

These findings highlight the relevance and effectiveness of Social Reward in assessing community appreciation for AI-generated artworks, establishing a closer alignment with users' creative goals: creating popular visual art.

Image Generation

Paper
Code

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

1 code implementation • 14 Feb 2024 • Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen

Many computational factors limit broader deployment of large language models.

Paper
Code

LLaGA: Large Language and Graph Assistant

2 code implementations • 13 Feb 2024 • Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, Zhangyang Wang

Graph Neural Networks (GNNs) have empowered the advance in graph-structured data analysis.

Paper
Code

QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits

1 code implementation • 10 Jan 2024 • Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang

To address these two pain points, we propose QuantumSEA, an in-time sparse exploration for noise-adaptive quantum circuits, aiming to achieve two key objectives: (1) implicit circuits capacity during training - by dynamically exploring the circuit's sparse connectivity and sticking a fixed small number of quantum gates throughout the training which satisfies the coherence time and enjoy light noises, enabling feasible executions on real quantum devices; (2) noise robustness - by jointly optimizing the topology and parameters of quantum circuits under real device noise models.

Quantum Machine Learning

Paper
Code

AGG: Amortized Generative 3D Gaussians for Single Image to 3D

no code implementations • 8 Jan 2024 • Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat

To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization.

3D Generation 3D Reconstruction +2

Paper
Add Code

VASE: Object-Centric Appearance and Shape Manipulation of Real Videos

no code implementations • 4 Jan 2024 • Elia Peruzzo, Vidit Goel, Dejia Xu, Xingqian Xu, Yifan Jiang, Zhangyang Wang, Humphrey Shi, Nicu Sebe

Recently, several works tackled the video editing task fostered by the success of large-scale text-to-image generative models.

Video Editing

Paper
Add Code

SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity

no code implementations • 31 Dec 2023 • Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

In this paper, we reveal that the gradient estimation in score distillation is inherent to high variance.

Text to 3D

Paper
Add Code

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

no code implementations • 31 Dec 2023 • Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

In this paper, we reveal that the existing score distillation-based text-to-3D generation frameworks degenerate to maximal likelihood seeking on each view independently and thus suffer from the mode collapse problem, manifesting as the Janus artifact in practice.

3D Generation Prompt Engineering +1

Paper
Add Code

4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency

no code implementations • 28 Dec 2023 • Yuyang Yin, Dejia Xu, Zhangyang Wang, Yao Zhao, Yunchao Wei

Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos), thus offering superior control over content creation.

Prompt Engineering

Paper
Add Code

HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

1 code implementation • 21 Dec 2023 • Hayk Manukyan, Andranik Sargsyan, Barsegh Atanyan, Zhangyang Wang, Shant Navasardyan, Humphrey Shi

Recent progress in text-guided image inpainting, based on the unprecedented success of text-to-image diffusion models, has led to exceptionally realistic and visually plausible results.

2k Image Inpainting +1

221

Paper
Code

The Counterattack of CNNs in Self-Supervised Learning: Larger Kernel Size might be All You Need

no code implementations • 9 Dec 2023 • Tianjin Huang, Tianlong Chen, Zhangyang Wang, Shiwei Liu

Therefore, it remains unclear whether the self-attention operation is crucial for the recent advances in SSL - or CNNs can deliver the same excellence with more advanced designs, too?

Self-Supervised Learning

Paper
Add Code

Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields

no code implementations • 6 Dec 2023 • Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, Achuta Kadambi

In this work, we go one step further: in addition to radiance field rendering, we enable 3D Gaussian splatting on arbitrary-dimension semantic features via 2D foundation model distillation.

Novel View Synthesis Semantic Segmentation

Paper
Add Code

Meta ControlNet: Enhancing Task Adaptation via Meta Learning

1 code implementation • 3 Dec 2023 • Junjie Yang, Jinze Zhao, Peihao Wang, Zhangyang Wang, Yingbin Liang

However, vanilla ControlNet generally requires extensive training of around 5000 steps to achieve a desirable control for a single task.

Edge Detection Image Generation +1

Paper
Code

Rethinking PGD Attack: Is Sign Function Necessary?

1 code implementation • 3 Dec 2023 • Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang

Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.

Paper
Code

FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting

no code implementations • 1 Dec 2023 • Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang

Novel view synthesis from limited observations remains an important and persistent task.

Novel View Synthesis

Paper
Add Code

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS

1 code implementation • 28 Nov 2023 • Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang

Recent advancements in real-time neural rendering using point-based techniques have paved the way for the widespread adoption of 3D representations.

Network Pruning Neural Rendering +2

428

Paper
Code

DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

1 code implementation • 27 Nov 2023 • Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, Zhangyang Wang

To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, by a differentially-private (DP) ensemble of in-context learning with private demonstrations.

In-Context Learning Language Modelling +3

Paper
Code

Fine-Tuning Language Models Using Formal Methods Feedback

no code implementations • 27 Oct 2023 • Yunhao Yang, Neel P. Bhatt, Tyler Ingebrand, William Ward, Steven Carr, Zhangyang Wang, Ufuk Topcu

Although pre-trained language models encode generic knowledge beneficial for planning and control, they may fail to generate appropriate control policies for domain-specific tasks.

Autonomous Driving

Paper
Add Code

Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge

no code implementations • 24 Oct 2023 • Gregory Holste, Yiliang Zhou, Song Wang, Ajay Jaiswal, Mingquan Lin, Sherry Zhuge, Yuzhe Yang, Dongkyun Kim, Trong-Hieu Nguyen-Mau, Minh-Triet Tran, Jaehyup Jeong, Wongi Park, Jongbin Ryu, Feng Hong, Arsh Verma, Yosuke Yamagishi, Changhyun Kim, Hyeryeong Seo, Myungjoo Kang, Leo Anthony Celi, Zhiyong Lu, Ronald M. Summers, George Shih, Zhangyang Wang, Yifan Peng

Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" $\unicode{x2013}$ there are a few common findings followed by many more relatively rare conditions.

Image Classification Medical Image Classification

Paper
Add Code

Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else

no code implementations • 11 Oct 2023 • Hazarapet Tunanyan, Dejia Xu, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

To achieve this goal, we identify the limitations in the text embeddings used for the pre-trained text-to-image diffusion models.

Image Manipulation Text-to-Image Generation

Paper
Add Code

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

no code implementations • 10 Oct 2023 • Xuxi Chen, Yu Yang, Zhangyang Wang, Baharan Mirzasoleiman

Dataset distillation aims to minimize the time and memory needed for training deep networks on large datasets, by creating a small set of synthetic images that has a similar generalization performance to that of the full dataset.

Paper
Add Code

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity

1 code implementation • 8 Oct 2023 • Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Mykola Pechenizkiy, Yi Liang, Zhangyang Wang, Shiwei Liu

Large Language Models (LLMs), renowned for their remarkable performance across diverse domains, present a challenge when it comes to practical deployment due to their colossal model size.

Network Pruning

Paper
Code

Pose-Free Generalizable Rendering Transformer

no code implementations • 5 Oct 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Hanwen Jiang, Dejia Xu, Zehao Zhu, Dilin Wang, Zhangyang Wang

To address this challenge, we introduce PF-GRT, a new Pose-Free framework for Generalizable Rendering Transformer, eliminating the need for pre-computed camera poses and instead leveraging feature-matching learned directly from data.

Generalizable Novel View Synthesis Novel View Synthesis

Paper
Add Code

Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day

1 code implementation • 4 Oct 2023 • Yifan Jiang, Hao Tang, Jen-Hao Rick Chang, Liangchen Song, Zhangyang Wang, Liangliang Cao

Although the fidelity and generalizability are greatly improved, training such a powerful diffusion model requires a vast volume of training data and model parameters, resulting in a notoriously long time and high computational costs.

Image Generation Novel View Synthesis

Paper
Code

Compressing LLMs: The Truth is Rarely Pure and Never Simple

1 code implementation • 2 Oct 2023 • Ajay Jaiswal, Zhe Gan, Xianzhi Du, BoWen Zhang, Zhangyang Wang, Yinfei Yang

Recently, several works have shown significant success in training-free and data-free compression (pruning and quantization) of LLMs that achieve 50 - 60% sparsity and reduce the bit width to 3 or 4 bits per weight, with negligible degradation of perplexity over the uncompressed baseline.

Quantization Retrieval

Paper
Code

Do Compressed LLMs Forget Knowledge? An Experimental Study with Practical Implications

no code implementations • 2 Oct 2023 • Duc N. M Hoang, Minsik Cho, Thomas Merth, Mohammad Rastegari, Zhangyang Wang

We start by proposing two conjectures on the nature of the damage: one is certain knowledge being forgotten (or erased) after LLM compression, hence necessitating the compressed model to (re)learn from data with additional parameters; the other presumes that knowledge is internally displaced and hence one requires merely "inference re-direction" with input-side augmentation such as prompting, to recover the knowledge-related performance.

Paper
Add Code

Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

1 code implementation • 29 Sep 2023 • Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang

Contrary to this belief, this paper presents a counter-argument: small-magnitude weights of pre-trained model weights encode vital knowledge essential for tackling difficult downstream tasks - manifested as the monotonic relationship between the performance drop of downstream tasks across the difficulty spectrum, as we prune more pre-trained weights by magnitude.

Quantization

Paper
Code

Safe and Robust Watermark Injection with a Single OoD Image

1 code implementation • 4 Sep 2023 • Shuyang Yu, Junyuan Hong, Haobo Zhang, Haotao Wang, Zhangyang Wang, Jiayu Zhou

Training a high-performance deep neural network requires large amounts of data and computational resources.

Model extraction

Paper
Code

Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts

1 code implementation • ICCV 2023 • Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma, Yi Wang, Zhangyang Wang

Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field.

Novel View Synthesis

Paper
Code

Robust Mixture-of-Expert Training for Convolutional Neural Networks

1 code implementation • ICCV 2023 • Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, huan zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu

Since the lack of robustness has become one of the main hurdles for CNNs, in this paper we ask: How to adversarially robustify a CNN-based MoE model?

Adversarial Robustness

Paper
Code

How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers?

1 code implementation • 17 Aug 2023 • Gregory Holste, Ziyu Jiang, Ajay Jaiswal, Maria Hanna, Shlomo Minkowitz, Alan C. Legasto, Joanna G. Escalon, Sharon Steinberger, Mark Bittman, Thomas C. Shen, Ying Ding, Ronald M. Summers, George Shih, Yifan Peng, Zhangyang Wang

This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.

Image Classification Medical Image Classification

Paper
Code

INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing

1 code implementation • 11 Aug 2023 • Stefan Abi-Karam, Rishov Sarkar, Dejia Xu, Zhiwen Fan, Zhangyang Wang, Cong Hao

In this work, we introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture.

Meta-Learning

Paper
Code

Doubly Robust Instance-Reweighted Adversarial Training

no code implementations • 1 Aug 2023 • Daouda Sow, Sen Lin, Zhangyang Wang, Yingbin Liang

Experiments on standard classification datasets demonstrate that our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance, and at the same time improves the robustness against attacks on the weakest data points.

Paper
Add Code

Reference-based Painterly Inpainting via Diffusion: Crossing the Wild Reference Domain Gap

no code implementations • 20 Jul 2023 • Dejia Xu, Xingqian Xu, Wenyan Cong, Humphrey Shi, Zhangyang Wang

We propose Reference-based Painterly Inpainting, a novel task that crosses the wild reference domain gap and implants novel objects into artworks.

Image Inpainting

Paper
Add Code

Physics-Driven Turbulence Image Restoration with Stochastic Refinement

1 code implementation • ICCV 2023 • Ajay Jaiswal, Xingguang Zhang, Stanley H. Chan, Zhangyang Wang

Although fast and physics-grounded simulation tools have been introduced to help the deep-learning models adapt to real-world turbulence conditions recently, the training of such models only relies on the synthetic data and ground truth pairs.

Image Restoration

Paper
Code

Polynomial Width is Sufficient for Set Representation with High-dimensional Features

no code implementations • 8 Jul 2023 • Peihao Wang, Shenghao Yang, Shu Li, Zhangyang Wang, Pan Li

To investigate the minimal value of $L$ that achieves sufficient expressive power, we present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activations (LE).

Inductive Bias

Paper
Add Code

Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities

1 code implementation • 5 Jul 2023 • Guihong Li, Duc Hoang, Kartikeya Bhardwaj, Ming Lin, Zhangyang Wang, Radu Marculescu

Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from the expensive training process.

Neural Architecture Search

Paper
Code

FarSight: A Physics-Driven Whole-Body Biometric System at Large Distance and Altitude

no code implementations • 29 Jun 2023 • Feng Liu, Ryan Ashbaugh, Nicholas Chimitt, Najmul Hassan, Ali Hassani, Ajay Jaiswal, Minchul Kim, Zhiyuan Mao, Christopher Perry, Zhiyuan Ren, Yiyang Su, Pegah Varghaei, Kai Wang, Xingguang Zhang, Stanley Chan, Arun Ross, Humphrey Shi, Zhangyang Wang, Anil Jain, Xiaoming Liu

Whole-body biometric recognition is an important area of research due to its vast applications in law enforcement, border security, and surveillance.

Image Restoration TAR

Paper
Add Code

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

1 code implementation • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.

271

Paper
Code

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

By dividing giant graph data, we build multiple independently and parallelly trained weaker GNNs (soup ingredient) without any intermediate communication, and combine their strength using a greedy interpolation soup procedure to achieve state-of-the-art performance.

graph partitioning Graph Sampling

Paper
Code

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

1 code implementation • 18 Jun 2023 • Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

Motivated by the recent observations of model soups, which suggest that fine-tuned weights of multiple models can be merged to a better minima, we propose Instant Soup Pruning (ISP) to generate lottery ticket quality subnetworks, using a fraction of the original IMP cost by replacing the expensive intermediate pruning stages of IMP with computationally efficient weak mask generation and aggregation routine.

Paper
Code

Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot, Generalizable Approach using RGB Images

1 code implementation • 13 Jun 2023 • Panwang Pan, Zhiwen Fan, Brandon Y. Feng, Peihao Wang, Chenxin Li, Zhangyang Wang

The accurate estimation of six degrees-of-freedom (6DoF) object poses is essential for many applications in robotics and augmented reality.

object-detection Object Detection +1

Paper
Code

Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts

1 code implementation • 30 May 2023 • Rishov Sarkar, Hanxue Liang, Zhiwen Fan, Zhangyang Wang, Cong Hao

Computer vision researchers are embracing two promising paradigms: Vision Transformers (ViTs) and Multi-task Learning (MTL), which both show great performance but are computation-intensive, given the quadratic complexity of self-attention in ViT and the need to activate an entire large MTL model for one task.

Multi-Task Learning

Paper
Code

Are Large Kernels Better Teachers than Transformers for ConvNets?

1 code implementation • 30 May 2023 • Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.

Knowledge Distillation

254

Paper
Code

Towards Constituting Mathematical Structures for Learning to Optimize

1 code implementation • 29 May 2023 • Jialin Liu, Xiaohan Chen, Zhangyang Wang, Wotao Yin, HanQin Cai

Learning to Optimize (L2O), a technique that utilizes machine learning to learn an optimization algorithm automatically from data, has gained arising attention in recent years.

Paper
Code

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

1 code implementation • 25 May 2023 • Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi

Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.

Conditional Text-to-Image Synthesis Image Generation +3

705

Paper
Code

POPE: 6-DoF Promptable Pose Estimation of Any Object, in Any Scene, with One Reference

1 code implementation • 25 May 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Dejia Xu, Hanwen Jiang, Zhangyang Wang

To mitigate this issue, we propose a general paradigm for object pose estimation, called Promptable Object Pose Estimation (POPE).

Object Pose Estimation

116

Paper
Code

MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition

no code implementations • 12 May 2023 • Xinyu Gong, Sreyas Mohan, Naina Dhingra, Jean-Charles Bazin, Yilei Li, Zhangyang Wang, Rakesh Ranjan

In this paper, we study a novel problem in egocentric action recognition, which we term as "Multimodal Generalization" (MMG).

Action Recognition Egocentric Activity Recognition +1

Paper
Add Code

In-Context Learning Unlocked for Diffusion Models

1 code implementation • NeurIPS 2023 • Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou

We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models.

In-Context Learning text-guided-image-editing

349

Paper
Code

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

1 code implementation • 28 Apr 2023 • Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, Kevin Wang, Yihan Xi, Dejia Xu, Zhangyang Wang

For a complicated algorithm, its implementation by a human programmer usually starts with outlining a rough control flow followed by iterative enrichments, eventually yielding carefully generated syntactic structures and variables in a hierarchy.

Code Generation Language Modelling +1

Paper
Code

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

1 code implementation • NeurIPS 2023 • Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou

Patch Diffusion meanwhile improves the performance of diffusion models trained on relatively small datasets, $e. g.$, as few as 5, 000 images to train from scratch.

Paper
Code

AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection

1 code implementation • Submitted to ICLR 2022 • Wentao Zhu, Yufang Huang, Xiufeng Xie, Wenxian Liu, Jincan Deng, Debing Zhang, Zhangyang Wang, Ji Liu

For video content creation and understanding, the shot boundary detection (SBD) is one of the most essential components in various scenarios.

Ranked #1 on Camera shot boundary detection on ClipShots

Boundary Detection Neural Architecture Search

Paper
Code

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

1 code implementation • 30 Mar 2023 • Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi

We propose PAIR Diffusion, a generic framework that can enable a diffusion model to control the structure and appearance properties of each object in the image.

Object

467

Paper
Code

Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models

1 code implementation • 30 Mar 2023 • Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi

The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry.

Disentanglement Memorization +1

Paper
Code

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

1 code implementation • ICCV 2023 • Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi

Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets.

Image Generation Text-to-Video Generation +3

3,789

Paper
Code

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!

1 code implementation • 3 Mar 2023 • Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang

In pursuit of a more general evaluation and unveiling the true potential of sparse algorithms, we introduce "Sparsity May Cry" Benchmark (SMC-Bench), a collection of carefully-curated 4 diverse tasks with 10 datasets, that accounts for capturing a wide range of domain-specific and sophisticated knowledge.

Paper
Code

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

1 code implementation • 2 Mar 2023 • Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang

Despite their remarkable achievement, gigantic transformers encounter significant drawbacks, including exorbitant computational and memory footprints during training, as well as severe collapse evidenced by a high degree of parameter redundancy.

Paper
Code

Learning to Grow Pretrained Models for Efficient Transformer Training

no code implementations • 2 Mar 2023 • Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Daniel Cox, Zhangyang Wang, Yoon Kim

Scaling transformers has led to significant breakthroughs in many domains, leading to a paradigm in which larger versions of existing models are trained and released on a periodic basis.

Paper
Add Code

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

1 code implementation • 28 Feb 2023 • Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang

This data-driven procedure yields L2O that can efficiently solve problems similar to those seen in training, that is, drawn from the same ``task distribution".

Paper
Code

Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations

1 code implementation • 27 Feb 2023 • Ziyu Jiang, Yinpeng Chen, Mengchen Liu, Dongdong Chen, Xiyang Dai, Lu Yuan, Zicheng Liu, Zhangyang Wang

This motivates us to shift the paradigm from combining loss at the end, to choosing the proper learning method per network layer.

Contrastive Learning Few-Shot Learning

Paper
Code

You Only Transfer What You Share: Intersection-Induced Graph Transfer Learning for Link Prediction

1 code implementation • 27 Feb 2023 • Wenqing Zheng, Edward W Huang, Nikhil Rao, Zhangyang Wang, Karthik Subbian

We identify this setting as Graph Intersection-induced Transfer Learning (GITL), which is motivated by practical applications in e-commerce or academic co-authorship predictions.

Link Prediction Transfer Learning

Paper
Code

Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?

1 code implementation • 24 Feb 2023 • Ruisi Cai, Zhenyu Zhang, Zhangyang Wang

Given a robust model trained to be resilient to one or multiple types of distribution shifts (e. g., natural image corruptions), how is that "robustness" encoded in the model weights, and how easily can it be disentangled and/or "zero-shot" transferred to some other models?

Paper
Code

Learning to Generalize Provably in Learning to Optimize

1 code implementation • 22 Feb 2023 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

While the optimizer generalization has been recently studied, the optimizee generalization (or learning to generalize) has not been rigorously studied in the L2O context, which is the aim of this paper.

247

Paper
Code

Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers

no code implementations • 6 Feb 2023 • Shiwei Liu, Zhangyang Wang

In response, we summarize ten Q\&As of SNNs from many key aspects, including dense vs. sparse, unstructured sparse vs. structured sparse, pruning vs. sparse training, dense-to-sparse training vs. sparse-to-sparse training, static sparsity vs. dynamic sparsity, before-training/during-training vs. post-training sparsity, and many more.

General Knowledge

Paper
Add Code

Turning the Curse of Heterogeneity in Federated Learning into a Blessing for Out-of-Distribution Detection

1 code implementation • ICLR 2023 • Shuyang Yu, Junyuan Hong, Haotao Wang, Zhangyang Wang, Jiayu Zhou

We propose to take advantage of such heterogeneity and turn the curse into a blessing that facilitates OoD detection in FL.

Autonomous Driving Federated Learning +1

Paper
Code

Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served patient populations

no code implementations • 26 Jan 2023 • Mingquan Lin, Yuyun Xiao, BoJian Hou, Tingyi Wanyan, Mohit Manoj Sharma, Zhangyang Wang, Fei Wang, Sarah Van Tassel, Yifan Peng

In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals.

Paper
Add Code

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style

no code implementations • CVPR 2023 • Haoming Lu, Hazarapet Tunanyan, Kai Wang, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

Diffusion models have demonstrated impressive capability of text-conditioned image synthesis, and broader application horizons are emerging by personalizing those pretrained diffusion models toward generating some specialized target object or style.

Disentanglement Image Generation

Paper
Add Code

MMG-Ego4D: Multimodal Generalization in Egocentric Action Recognition

no code implementations • CVPR 2023 • Xinyu Gong, Sreyas Mohan, Naina Dhingra, Jean-Charles Bazin, Yilei Li, Zhangyang Wang, Rakesh Ranjan

In this paper, we study a novel problem in egocentric action recognition, which we term as "Multimodal Generalization" (MMG).

Action Recognition Zero-shot Generalization

Paper
Add Code

Vision HGNN: An Image is More than a Graph of Nodes

1 code implementation • ICCV 2023 • Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, Zhangyang Wang

In this paper, we enhance ViG by transcending conventional "pairwise" linkages and harnessing the power of the hypergraph to encapsulate image information.

graph construction Image Classification +2

Paper
Code

AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts

1 code implementation • ICCV 2023 • Tianlong Chen, Xuxi Chen, Xianzhi Du, Abdullah Rashwan, Fan Yang, Huizhong Chen, Zhangyang Wang, Yeqing Li

Instead of compressing multiple tasks' knowledge into a single model, MoE separates the parameter space and only utilizes the relevant model pieces given task type and its input, which provides stabilized MTL training and ultra-efficient inference.

Instance Segmentation Multi-Task Learning +3

32,745

Paper
Code

NeuralLift-360: Lifting an In-the-Wild 2D Photo to a 3D Object With 360deg Views

no code implementations • CVPR 2023 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang

In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360deg views that corresponds well with the given reference image.

Denoising Depth Estimation

Paper
Add Code

Pruning Before Training May Improve Generalization, Provably

no code implementations • 1 Jan 2023 • Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang

It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero and the network exhibits good generalization performance.

Network Pruning

Paper
Add Code

Convergence and Generalization of Wide Neural Networks with Large Bias

no code implementations • 1 Jan 2023 • Hongru Yang, Ziyu Jiang, Ruizhe Zhang, Zhangyang Wang, Yingbin Liang

This work studies training one-hidden-layer overparameterized ReLU networks via gradient descent in the neural tangent kernel (NTK) regime, where the networks' biases are initialized to some constant rather than zero.

Paper
Add Code

Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

1 code implementation • 30 Dec 2022 • Wenqing Zheng, S P Sharan, Zhiwen Fan, Kevin Wang, Yihan Xi, Zhangyang Wang

Learning efficient and interpretable policies has been a challenging task in reinforcement learning (RL), particularly in the visual RL setting with complex scenes.

Reinforcement Learning (RL)

Paper
Code

Attend Who is Weak: Pruning-assisted Medical Image Localization under Sophisticated and Implicit Imbalances

no code implementations • 6 Dec 2022 • Ajay Jaiswal, Tianlong Chen, Justin F. Rousseau, Yifan Peng, Ying Ding, Zhangyang Wang

However, DNNs are notoriously fragile to the class imbalance in image classification.

Image Classification Network Pruning

Paper
Add Code

StegaNeRF: Embedding Invisible Information within Neural Radiance Fields

no code implementations • ICCV 2023 • Chenxin Li, Brandon Y. Feng, Zhiwen Fan, Panwang Pan, Zhangyang Wang

Recent advances in neural rendering imply a future of widespread visual data distributions through sharing NeRF model weights.

Neural Rendering

Paper
Add Code

NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views

1 code implementation • 29 Nov 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang

In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360{\deg} views that correspond well with the given reference image.

3D Reconstruction Image to 3D +3

306

Paper
Code

You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

1 code implementation • 28 Nov 2022 • Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu

Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i. e., untrained networks).

Out-of-Distribution Detection

Paper
Code

Search Behavior Prediction: A Hypergraph Perspective

1 code implementation • 23 Nov 2022 • Yan Han, Edward W Huang, Wenqing Zheng, Nikhil Rao, Zhangyang Wang, Karthik Subbian

With these hyperedges, we augment the original bipartite graph into a new \textit{hypergraph}.

Link Prediction

Paper
Code

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

1 code implementation • 19 Nov 2022 • Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.

Paper
Code

AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training

no code implementations • CVPR 2023 • Yifan Jiang, Peter Hedman, Ben Mildenhall, Dejia Xu, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue

Neural Radiance Fields (NeRFs) are a powerful representation for modeling a 3D scene as a continuous function.

Camera Calibration Vocal Bursts Intensity Prediction

Paper
Add Code

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model

3 code implementations • ICCV 2023 • Xingqian Xu, Zhangyang Wang, Eric Zhang, Kai Wang, Humphrey Shi

In this work, we expand the existing single-flow diffusion pipeline into a multi-task multimodal network, dubbed Versatile Diffusion (VD), that handles multiple flows of text-to-image, image-to-text, and variations in one unified model.

Disentanglement Image Captioning +5

22,417

Paper
Code

StyleNAT: Giving Each Head a New Perspective

2 code implementations • 10 Nov 2022 • Steven Walton, Ali Hassani, Xingqian Xu, Zhangyang Wang, Humphrey Shi

Image generation has been a long sought-after but challenging task, and performing the generation task in an efficient manner is similarly difficult.

Ranked #2 on Image Generation on FFHQ 256 x 256

Face Generation

993

Paper
Code

QuanGCN: Noise-Adaptive Training for Robust Quantum Graph Convolutional Networks

no code implementations • 9 Nov 2022 • Kaixiong Zhou, Zhenyu Zhang, Shengyuan Chen, Tianlong Chen, Xiao Huang, Zhangyang Wang, Xia Hu

Quantum neural networks (QNNs), an interdisciplinary field of quantum computing and machine learning, have attracted tremendous research interests due to the specific quantum advantages.

Paper
Add Code

Scaling Multimodal Pre-Training via Cross-Modality Gradient Harmonization

no code implementations • 3 Nov 2022 • Junru Wu, Yi Liang, Feng Han, Hassan Akbari, Zhangyang Wang, Cong Yu

For example, even in the commonly adopted instructional videos, a speaker can sometimes refer to something that is not visually present in the current frame; and the semantic misalignment would only be more unpredictable for the raw videos from the internet.

Contrastive Learning

Paper
Add Code

Sparse Winning Tickets are Data-Efficient Image Recognizers

1 code implementation • NIPS 2022 • Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang

Improving the performance of deep networks in data-limited regimes has warranted much attention.

Paper
Code

M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design

1 code implementation • 26 Oct 2022 • Hanxue Liang, Zhiwen Fan, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang

However, when deploying MTL onto those real-world systems that are often resource-constrained or latency-sensitive, two prominent challenges arise: (i) during training, simultaneously optimizing all tasks is often difficult due to gradient conflicts across tasks; (ii) at inference, current MTL regimes have to activate nearly the entire model even to just execute a single task.

Multi-Task Learning

Paper
Code

Symbolic Distillation for Learned TCP Congestion Control

1 code implementation • 24 Oct 2022 • S P Sharan, Wenqing Zheng, Kuo-Feng Hsu, Jiarong Xing, Ang Chen, Zhangyang Wang

At the core of our proposal is a novel symbolic branching algorithm that enables the rule to be aware of the context in terms of various network conditions, eventually converting the NN policy into a symbolic tree.

Reinforcement Learning (RL)

Paper
Code

Signal Processing for Implicit Neural Representations

no code implementations • 17 Oct 2022 • Dejia Xu, Peihao Wang, Yifan Jiang, Zhiwen Fan, Zhangyang Wang

We answer this question by proposing an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.

Deblurring Denoising +1

Paper
Add Code

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

no code implementations • 16 Oct 2022 • Yimeng Zhang, Akshay Karkal Kamath, Qiucheng Wu, Zhiwen Fan, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao

In this paper, we propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on High-Definition (HD) video stream.

Model Compression Multi-Object Tracking

Paper
Add Code

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

no code implementations • 15 Oct 2022 • Ajay Jaiswal, Kumar Ashutosh, Justin F Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding

Our extensive experiments on popular medical imaging classification tasks (cardiopulmonary disease and lesion classification) using real-world datasets, show the performance benefit of RoS-KD, its ability to distill knowledge from many popular large networks (ResNet-50, DenseNet-121, MobileNet-V2) in a comparatively small network, and its robustness to adversarial attacks (PGD, FSGM).

Classification Knowledge Distillation +1

Paper
Add Code

A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking

2 code implementations • 14 Oct 2022 • Keyu Duan, Zirui Liu, Peihao Wang, Wenqing Zheng, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang

Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).

Ranked #2 on Node Property Prediction on ogbn-products

Benchmarking Node Classification +1

Paper
Code

Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again

1 code implementation • 14 Oct 2022 • Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin F. Rousseau, Ying Ding, Zhangyang Wang

In this paper, firstly, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability, as well as achieve state-of-the-art (SOTA) level performance from vanilla-GCNs.

Paper
Code

Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork

1 code implementation • 12 Oct 2022 • Haotao Wang, Junyuan Hong, Aston Zhang, Jiayu Zhou, Zhangyang Wang

As a result, both the stem and the classification head in the final network are hardly affected by backdoor training samples.

backdoor defense Classification +1

Paper
Code

Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative

1 code implementation • 7 Oct 2022 • Tianxin Wei, Yuning You, Tianlong Chen, Yang shen, Jingrui He, Zhangyang Wang

This paper targets at improving the generalizability of hypergraph neural networks in the low-label regime, through applying the contrastive learning approach from images/graphs (we refer to it as HyperGCL).

Contrastive Learning Fairness +1

Paper
Code

DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and Temporal Relatedness

no code implementations • 26 Sep 2022 • Zepeng Huo, Taowei Ji, Yifei Liang, Shuai Huang, Zhangyang Wang, Xiaoning Qian, Bobak Mortazavi

We argue that traditional methods have rarely made use of both times-series dynamics of the data as well as the relatedness of the features from different sensors.

Activity Recognition Denoising +3

Paper
Add Code

NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes

1 code implementation • 19 Sep 2022 • Zhiwen Fan, Peihao Wang, Yifan Jiang, Xinyu Gong, Dejia Xu, Zhangyang Wang

Our framework, called NeRF with Self-supervised Object Segmentation NeRF-SOS, couples object segmentation and neural radiance field to segment objects in any view within a scene.

Object Segmentation +2

124

Paper
Code

Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?

2 code implementations • 15 Sep 2022 • Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zhangyang Wang

Vision Transformers (ViTs) have proven to be effective, in solving 2D image understanding tasks by training over large-scale image datasets; and meanwhile as a somehow separate track, in modeling the 3D visual world too such as voxels or point clouds.

Point Cloud Segmentation

Paper
Code

Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study

1 code implementation • 29 Aug 2022 • Gregory Holste, Song Wang, Ziyu Jiang, Thomas C. Shen, George Shih, Ronald M. Summers, Yifan Peng, Zhangyang Wang

Imaging exams, such as chest radiography, will yield a small set of common findings and a much larger set of uncommon findings.

Ranked #1 on Long-tail Learning on MIMIC-CXR-LT

Image Classification Long-tail Learning +1

Paper
Code

Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

no code implementations • 10 Aug 2022 • Zhengang Li, Mengshu Sun, Alec Lu, Haoyu Ma, Geng Yuan, Yanyue Xie, Hao Tang, Yanyu Li, Miriam Leeser, Zhangyang Wang, Xue Lin, Zhenman Fang

Compared with state-of-the-art ViT quantization work (algorithmic approach only without hardware acceleration), our quantization achieves 0. 47% to 1. 36% higher Top-1 accuracy under the same bit-width.

Quantization

Paper
Add Code

Is Attention All That NeRF Needs?

1 code implementation • 27 Jul 2022 • Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang

While prior works on NeRFs optimize a scene representation by inverting a handcrafted rendering equation, GNT achieves neural representation and rendering that generalizes across scenes using transformers at two stages.

Ranked #1 on Generalizable Novel View Synthesis on LLFF

Generalizable Novel View Synthesis Inductive Bias +1

327

Paper
Code

Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis

2 code implementations • 23 Jul 2022 • Gregory Holste, Evangelos K. Oikonomou, Bobak J. Mortazavi, Zhangyang Wang, Rohan Khera

Advances in self-supervised learning (SSL) have shown that self-supervised pretraining on medical imaging data can provide a strong initialization for downstream supervised classification and segmentation.

Classification Contrastive Learning +3

Paper
Code

Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data

no code implementations • 23 Jul 2022 • Zepeng Huo, Xiaoning Qian, Shuai Huang, Zhangyang Wang, Bobak J. Mortazavi

Medical events of interest, such as mortality, often happen at a low rate in electronic medical records, as most admitted patients survive.

Paper
Add Code

Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model

1 code implementation • 20 Jul 2022 • Zhiyuan Mao, Ajay Jaiswal, Zhangyang Wang, Stanley H. Chan

Image restoration algorithms for atmospheric turbulence are known to be much more challenging to design than traditional ones such as blur or noise because the distortion caused by the turbulence is an entanglement of spatially varying blur, geometric distortion, and sensor noise.

Image Restoration SSIM

Paper
Code

Equivariant Hypergraph Diffusion Neural Operators

1 code implementation • 14 Jul 2022 • Peihao Wang, Shenghao Yang, Yunyu Liu, Zhangyang Wang, Pan Li

Hypergraph neural networks (HNNs) using neural networks to encode hypergraphs provide a promising way to model higher-order relations in data and further solve relevant prediction tasks built upon such higher-order relations.

Computational Efficiency Node Classification

Paper
Code

Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays

1 code implementation • 10 Jul 2022 • Yan Han, Gregory Holste, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang

Using the learned self-attention of its image branch, RGT extracts a bounding box for which to compute radiomic features, which are further processed by the radiomics branch; learned image and radiomic features are then fused and mutually interact via cross-attention layers.

Paper
Code

Neural Implicit Dictionary via Mixture-of-Expert Training

1 code implementation • 8 Jul 2022 • Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang

In this paper, we present a generic INR framework that achieves both data and training efficiency by learning a Neural Implicit Dictionary (NID) from a data collection and representing INR as a functional combination of basis sampled from the dictionary.

Image Inpainting

Paper
Code

More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity

1 code implementation • 7 Jul 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Xuxi Chen, Qiao Xiao, Boqian Wu, Tommi Kärkkäinen, Mykola Pechenizkiy, Decebal Mocanu, Zhangyang Wang

Transformers have quickly shined in the computer vision world since the emergence of Vision Transformers (ViTs).

Object Detection Semantic Segmentation

254

Paper
Code

Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex Smola, Zhangyang Wang

However, in real-world applications, it is common for the training sets to have long-tailed distributions.

Anomaly Detection Contrastive Learning +2

Paper
Code

Aug-NeRF: Training Stronger Neural Radiance Fields with Triple-Level Physically-Grounded Augmentations

1 code implementation • CVPR 2022 • Tianlong Chen, Peihao Wang, Zhiwen Fan, Zhangyang Wang

Inspired by that, we propose Augmented NeRF (Aug-NeRF), which for the first time brings the power of robust data augmentations into regularizing the NeRF training.

Novel View Synthesis Out-of-Distribution Generalization

123

Paper
Code

Removing Batch Normalization Boosts Adversarial Training

1 code implementation • 4 Jul 2022 • Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang

In addition, NoFrost achieves a $23. 56\%$ adversarial robustness against PGD attack, which improves the $13. 57\%$ robustness in BN-based AT.

Adversarial Robustness

Paper
Code

How Robust is Your Fairness? Evaluating and Sustaining Fairness under Unseen Distribution Shifts

no code implementations • 4 Jul 2022 • Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang

Increasing concerns have been raised on deep learning fairness in recent years.

Fairness

Paper
Add Code

Training Your Sparse Neural Network Better with Any Mask

1 code implementation • 26 Jun 2022 • Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang

Pruning large neural networks to create high-quality, independently trainable sparse masks, which can maintain similar performance to their dense counterparts, is very desirable due to the reduced space and time complexity.

Paper
Code

Queried Unlabeled Data Improves and Robustifies Class-Incremental Learning

no code implementations • 15 Jun 2022 • Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

Inspired by the recent success of learning robust models with unlabeled data, we explore a new robustness-aware CIL setting, where the learned adversarial robustness has to resist forgetting and be transferred as new tasks come in continually.

Adversarial Robustness Class Incremental Learning +1

Paper
Add Code

Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness

1 code implementation • 15 Jun 2022 • Tianlong Chen, huan zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang

Certifiable robustness is a highly desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios, but often demands tedious computations to establish.

Paper
Code

Can pruning improve certified robustness of neural networks?

1 code implementation • 15 Jun 2022 • Zhangheng Li, Tianlong Chen, Linyi Li, Bo Li, Zhangyang Wang

Given the fact that neural networks are often over-parameterized, one effective way to reduce such computational overhead is neural network pruning, by removing redundant parameters from trained neural networks.

Network Pruning

Paper
Code

A Multi-purpose Realistic Haze Benchmark with Quantifiable Haze Levels and Ground Truth

no code implementations • 13 Jun 2022 • Priya Narayanan, Xin Hu, Zhenyu Wu, Matthew D Thielke, John G Rogers, Andre V Harrison, John A D'Agostino, James D Brown, Long P Quang, James R Uplinger, Heesung Kwon, Zhangyang Wang

The full dataset presented in this paper, including the ground truth object classification bounding boxes and haze density measurements, is provided for the community to evaluate their algorithms at: https://a2i2-archangel. vision.

Object object-detection +3

Paper
Add Code

VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution

1 code implementation • CVPR 2022 • Zeyuan Chen, Yinbo Chen, Jingwen Liu, Xingqian Xu, Vidit Goel, Zhangyang Wang, Humphrey Shi, Xiaolong Wang

The learned implicit neural representation can be decoded to videos of arbitrary spatial resolution and frame rate.

Space-time Video Super-resolution Video Frame Interpolation +1

257

Paper
Code

DiSparse: Disentangled Sparsification for Multitask Model Compression

1 code implementation • CVPR 2022 • Xinglong Sun, Ali Hassani, Zhangyang Wang, Gao Huang, Humphrey Shi

We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architecture identified by each task even before the training starts.

Model Compression

Paper
Code

Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

1 code implementation • 9 Jun 2022 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang

For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89. 26%/73. 79%, 89. 26%/79. 03%, and 91. 41%/83. 22% sparsity, respectively.

Transfer Learning

Paper
Code

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

1 code implementation • 5 Jun 2022 • Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Unmanned Aerial Vehicles (UAVs) based video text spotting has been extensively used in civil and military domains.

Text Spotting

Paper
Code

Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free

1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, Zhangyang Wang

Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger.

Network Pruning

Paper
Code

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis

2 code implementations • 11 May 2022 • Wuyang Chen, Wei Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang

Advanced deep neural networks (DNNs), designed by either human or AutoML algorithms, are growing increasingly complex.

Neural Architecture Search

Paper
Code

Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN

1 code implementation • 27 Apr 2022 • Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang

To study the motion features in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotion", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips.

Disentanglement Face Generation

Paper
Code

E^2TAD: An Energy-Efficient Tracking-based Action Detector

1 code implementation • 9 Apr 2022 • Xin Hu, Zhenyu Wu, Hao-Yu Miao, Siqi Fan, Taiyu Long, Zhenyu Hu, Pengcheng Pi, Yi Wu, Zhou Ren, Zhangyang Wang, Gang Hua

Video action detection (spatio-temporal action localization) is usually the starting point for human-centric intelligent analysis of videos nowadays.

Fine-Grained Action Detection object-detection +3

Paper
Code

Unified Implicit Neural Stylization

1 code implementation • 5 Apr 2022 • Zhiwen Fan, Yifan Jiang, Peihao Wang, Xinyu Gong, Dejia Xu, Zhangyang Wang

Representing visual signals by implicit representation (e. g., a coordinate based deep network) has prevailed among many vision tasks.

Neural Stylization Novel View Synthesis

107

Paper
Code

APP: Anytime Progressive Pruning

1 code implementation • 4 Apr 2022 • Diganta Misra, Bharat Runwal, Tianlong Chen, Zhangyang Wang, Irina Rish

With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings.

Network Pruning Sparse Learning

Paper
Code

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image

1 code implementation • 2 Apr 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang

Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense covers largely prohibits its wider applications.

Novel View Synthesis

325

Paper
Code

VFDS: Variational Foresight Dynamic Selection in Bayesian Neural Networks for Efficient Human Activity Recognition

no code implementations • 31 Mar 2022 • Randy Ardywibowo, Shahin Boluki, Zhangyang Wang, Bobak Mortazavi, Shuai Huang, Xiaoning Qian

At its core is an implicit variational distribution on binary gates that are dependent on previous observations, which will select the next subset of features to observe.

Human Activity Recognition

Paper
Add Code

On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks

no code implementations • 27 Mar 2022 • Hongru Yang, Zhangyang Wang

It is shown that given a pruning probability, for fully-connected neural networks with the weights randomly pruned at the initialization, as the width of each layer grows to infinity sequentially, the NTK of the pruned neural network converges to the limiting NTK of the original network with some extra scaling.

Image Classification

Paper
Add Code

Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization

1 code implementation • ICLR 2022 • Junyuan Hong, Haotao Wang, Zhangyang Wang, Jiayu Zhou

In this paper, we propose a novel Split-Mix FL strategy for heterogeneous participants that, once training is done, provides in-situ customization of model sizes and robustness.

Personalized Federated Learning

Paper
Code

Unified Visual Transformer Compression

1 code implementation • ICLR 2022 • Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen yang, Ji Liu, Zhangyang Wang

Vision transformers (ViTs) have gained popularity recently.

Knowledge Distillation

Paper
Code

Symbolic Learning to Optimize: Towards Interpretability and Scalability

1 code implementation • ICLR 2022 • Wenqing Zheng, Tianlong Chen, Ting-Kuei Hu, Zhangyang Wang

Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks.

Symbolic Regression

Paper
Code

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy

1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

However, a "head-to-toe assessment" regarding the extent of redundancy in ViTs, and how much we could gain by thoroughly mitigating such, has been absent for this field.

Paper
Code

Optimizer Amalgamation

1 code implementation • ICLR 2022 • Tianshu Huang, Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

Selecting an appropriate optimizer for a given problem is of major interest for researchers and practitioners.

Paper
Code

Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice

1 code implementation • 9 Mar 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang

The first technique, termed AttnScale, decomposes a self-attention block into low-pass and high-pass components, then rescales and combines these two filters to produce an all-pass self-attention matrix.

Paper
Code

Auto-scaling Vision Transformers without Training

1 code implementation • ICLR 2022 • Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wang, Denny Zhou

The motivation comes from two pain spots: 1) the lack of efficient and principled methods for designing and scaling ViTs; 2) the tremendous computational cost of training ViT that is much heavier than its convolution counterpart.

Paper
Code

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training

1 code implementation • ICLR 2022 • Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang

We introduce two alternatives for sparse adversarial training: (i) static sparsity, by leveraging recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from the early training; (ii) dynamic sparsity, by allowing the sparse subnetwork to adaptively adjust its connectivity pattern (while sticking to the same sparsity ratio) throughout training.

Paper
Code

Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

1 code implementation • 9 Feb 2022 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i. e., winning tickets) that can be trained in isolation to match full accuracy.

Paper
Code

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

1 code implementation • ICLR 2022 • Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks.

Adversarial Robustness Out-of-Distribution Detection

Paper
Code

VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer

no code implementations • 17 Jan 2022 • Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang

To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.

Quantization

Paper
Add Code

Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations

1 code implementation • 4 Jan 2022 • Yuning You, Tianlong Chen, Zhangyang Wang, Yang shen

Accordingly, we have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators, assuming that graph priors per se, similar to the concept of image manifolds, can be learned by data generation.

Contrastive Learning Graph Learning

105

Paper
Code

Fast and High-Quality Image Denoising via Malleable Convolutions

no code implementations • 2 Jan 2022 • Yifan Jiang, Bartlomiej Wronski, Ben Mildenhall, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue

These spatially-varying kernels are produced by an efficient predictor network running on a downsampled input, making them much more efficient to compute than per-pixel kernels produced by a full-resolution image, and also enlarging the network's receptive field compared with static kernels.

Image Denoising Image Restoration +1

Paper
Add Code

CADTransformer: Panoptic Symbol Spotting Transformer for CAD Drawings

1 code implementation • CVPR 2022 • Zhiwen Fan, Tianlong Chen, Peihao Wang, Zhangyang Wang

CADTransformer tokenizes directly from the set of graphical primitives in CAD drawings, and correspondingly optimizes line-grained semantic and instance symbol spotting altogether by a pair of prediction heads.

Data Augmentation

Paper
Code

EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval

no code implementations • CVPR 2022 • Haoyu Ma, Handong Zhao, Zhe Lin, Ajinkya Kale, Zhangyang Wang, Tong Yu, Jiuxiang Gu, Sunav Choudhary, Xiaohui Xie

recommendation, and marketing services.

Causal Inference Contrastive Learning +3

Paper
Add Code

Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better

1 code implementation • 18 Dec 2021 • Sameer Bibikar, Haris Vikalo, Zhangyang Wang, Xiaohan Chen

Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.

Federated Learning

Paper
Code

A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation

3 code implementations • 17 Dec 2021 • Wuyang Chen, Xianzhi Du, Fan Yang, Lucas Beyer, Xiaohua Zhai, Tsung-Yi Lin, Huizhong Chen, Jing Li, Xiaodan Song, Zhangyang Wang, Denny Zhou

In this paper, we comprehensively study three architecture design choices on ViT -- spatial reduction, doubled channels, and multiscale features -- and demonstrate that a vanilla ViT architecture can fulfill this goal without handcrafting multiscale features, maintaining the original ViT design philosophy.

Image Classification Instance Segmentation +6

76,579

Paper
Code

Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search

no code implementations • 9 Dec 2021 • Yifan Jiang, Xinyu Gong, Junru Wu, Humphrey Shi, Zhicheng Yan, Zhangyang Wang

Efficient video architecture is the key to deploying video recognition systems on devices with limited computing resources.

Neural Architecture Search Video Recognition +1

Paper
Add Code

Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods

2 code implementations • ICLR 2022 • Wenqing Zheng, Edward W Huang, Nikhil Rao, Sumeet Katariya, Zhangyang Wang, Karthik Subbian

We propose Cold Brew, a teacher-student distillation approach to address the SCS and noisy-neighbor challenges for GNNs.

Node Classification

Paper
Code

Improving Contrastive Learning on Imbalanced Seed Data via Open-World Sampling

1 code implementation • NeurIPS 2021 • Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Contrastive learning approaches have achieved great success in learning visual representations with few labels of the target classes.

Attribute Contrastive Learning

Paper
Code

DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

1 code implementation • 30 Oct 2021 • Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng

To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights.

Paper
Code

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership

1 code implementation • NeurIPS 2021 • Xuxi Chen, Tianlong Chen, Zhenyu Zhang, Zhangyang Wang

The lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork (i. e., winning ticket) instead of a full model for both training and inference, that can lower both costs without sacrificing the performance.

Paper
Code

Hyperparameter Tuning is All You Need for LISTA

1 code implementation • NeurIPS 2021 • Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin

Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network.

Rolling Shutter Correction

Paper
Code

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems

1 code implementation • NeurIPS 2021 • Wenqing Zheng, Qiangqiang Guo, Hao Yang, Peihao Wang, Zhangyang Wang

This paper presents the Delayed Propagation Transformer (DePT), a new transformer-based model that specializes in the global modeling of CPS while taking into account the immutable constraints from the physical world.

Inductive Bias

Paper
Code

AugMax: Adversarial Composition of Random Augmentations for Robust Training

1 code implementation • NeurIPS 2021 • Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, Zhangyang Wang

Diversity and hardness are two complementary dimensions of data augmentation to achieve robustness.

Data Augmentation

123

Paper
Code

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

1 code implementation • 9 Oct 2021 • Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang

This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language was seen as an individual task and was learned sequentially and continually.

Speech Synthesis Text-To-Speech Synthesis

Paper
Code

Universality of Winning Tickets: A Renormalization Group Perspective

no code implementations • 7 Oct 2021 • William T. Redman, Tianlong Chen, Zhangyang Wang, Akshunna S. Dogra

Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.

Paper
Add Code

Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable

no code implementations • ICLR 2022 • Shaojin Ding, Tianlong Chen, Zhangyang Wang

In this paper, we investigate the tantalizing possibility of using lottery ticket hypothesis to discover lightweight speech recognition models, that are (1) robust to various noise existing in speech; (2) transferable to fit the open-world personalization; and 3) compatible with structured sparsity.

speech-recognition Speech Recognition

Paper
Add Code

Universality of Deep Neural Network Lottery Tickets: A Renormalization Group Perspective

no code implementations • 29 Sep 2021 • William T Redman, Tianlong Chen, Akshunna S. Dogra, Zhangyang Wang

Paper
Add Code

Scaling the Depth of Vision Transformers via the Fourier Domain Analysis

no code implementations • ICLR 2022 • Peihao Wang, Wenqing Zheng, Tianlong Chen, Zhangyang Wang

Paper
Add Code

Reasoning With Hierarchical Symbols: Reclaiming Symbolic Policies For Visual Reinforcement Learning

no code implementations • 29 Sep 2021 • Wenqing Zheng, S P Sharan, Zhiwen Fan, Zhangyang Wang

Deep vision models are nowadays widely integrated into visual reinforcement learning (RL) to parameterize the policy networks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Lottery Tickets can have Structural Sparsity

no code implementations • 29 Sep 2021 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i. e., $\textit{winning tickets}$) that can be trained in isolation to match full accuracy.

Paper
Add Code

Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining

1 code implementation • ICLR 2022 • Lu Miao, Xiaolong Luo, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang

Conventional methods often require (iterative) pruning followed by re-training, which not only incurs large overhead beyond the original DNN training but also can be sensitive to retraining hyperparameters.

Paper
Code

Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How

no code implementations • ICLR 2022 • Yuning You, Yue Cao, Tianlong Chen, Zhangyang Wang, Yang shen

Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.

Uncertainty Quantification Variational Inference

Paper
Add Code

AutoCoG: A Unified Data-Modal Co-Search Framework for Graph Neural Networks

no code implementations • 29 Sep 2021 • Duc N.M Hoang, Kaixiong Zhou, Tianlong Chen, Xia Hu, Zhangyang Wang

Despite the preliminary success, we argue that for GNNs, NAS has to be customized further, due to the topological complicacy of GNN input data (graph) as well as the notorious training instability.

Data Augmentation Language Modelling +1

Paper
Add Code

Stingy Teacher: Sparse Logits Suffice to Fail Knowledge Distillation

no code implementations • 29 Sep 2021 • Haoyu Ma, Yifan Huang, Tianlong Chen, Hao Tang, Chenyu You, Zhangyang Wang, Xiaohui Xie

However, it is unclear why the distorted distribution of the logits is catastrophic to the student model.

Knowledge Distillation

Paper
Add Code

Generalizable Learning to Optimize into Wide Valleys

no code implementations • 29 Sep 2021 • Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, DaCheng Tao, Yingbin Liang, Zhangyang Wang

Learning to optimize (L2O) has gained increasing popularity in various optimization tasks, since classical optimizers usually require laborious, problem-specific design and hyperparameter tuning.

Paper
Add Code

CheXT: Knowledge-Guided Cross-Attention Transformer for Abnormality Classification and Localization in Chest X-rays

no code implementations • 29 Sep 2021 • Yan Han, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang

During training, the image branch leverages its learned attention to estimate pathology localization, which is then utilized to extract radiomic features from images in the radiomics branch.

Paper
Add Code

Equalized Robustness: Towards Sustainable Fairness Under Distributional Shifts

no code implementations • 29 Sep 2021 • Haotao Wang, Junyuan Hong, Jiayu Zhou, Zhangyang Wang

In this paper, we first propose a new fairness goal, termed Equalized Robustness (ER), to impose fair model robustness against unseen distribution shifts across majority and minority groups.

Fairness

Paper
Add Code

Lottery Image Prior

no code implementations • 29 Sep 2021 • Qiming Wu, Xiaohan Chen, Yifan Jiang, Pan Zhou, Zhangyang Wang

Drawing inspirations from the recently prosperous research on lottery ticket hypothesis (LTH), we conjecture and study a novel “lottery image prior” (LIP), stated as: given an (untrained or trained) DNN-based image prior, it will have a sparse subnetwork that can be training in isolation, to match the original DNN’s performance when being applied as a prior to various image inverse problems.

Compressive Sensing Image Reconstruction +1

Paper
Add Code

Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

no code implementations • ICLR 2022 • Xiaohan Chen, Jason Zhang, Zhangyang Wang

In this work, we define an extended class of subnetworks in randomly initialized NNs called disguised subnetworks, which are not only "hidden" in the random networks but also "disguised" -- hence can only be "unmasked" with certain transformations on weights.

Paper
Add Code

Skeleton-Graph: Long-Term 3D Motion Prediction From 2D Observations Using Deep Spatio-Temporal Graph CNNs

1 code implementation • 21 Sep 2021 • Abduallah Mohamed, Huancheng Chen, Zhangyang Wang, Christian Claudel

We propose Skeleton-Graph, a deep spatio-temporal graph CNN model that predicts the future 3D skeleton poses in a single pass from the 2D ones.

Ranked #1 on Trajectory Prediction on PROX

Autonomous Driving motion prediction +1

Paper
Code

GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization

no code implementations • ICCV 2021 • Yi Guo, Huan Yuan, Jianchao Tan, Zhangyang Wang, Sen yang, Ji Liu

During the training process, the polarization effect will drive a subset of gates to smoothly decrease to exact zero, while other gates gradually stay away from zero by a large margin.

Model Compression Network Pruning

Paper
Add Code

Searching for Two-Stream Models in Multivariate Space for Video Recognition

no code implementations • ICCV 2021 • Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan

We design a multivariate search space, including 6 search variables to capture a wide variety of choices in designing two-stream models.

Neural Architecture Search Video Recognition +1

Paper
Add Code

Font Completion and Manipulation by Cycling Between Multi-Modality Representations

1 code implementation • 30 Aug 2021 • Ye Yuan, Wuyang Chen, Zhaowen Wang, Matthew Fisher, Zhifei Zhang, Zhangyang Wang, Hailin Jin

The novel graph constructor maps a glyph's latent code to its graph representation that matches expert knowledge, which is trained to help the translation task.

Image-to-Image Translation Representation Learning +2

Paper
Code

Understanding and Accelerating Neural Architecture Search with Training-Free and Theory-Grounded Metrics

1 code implementation • 26 Aug 2021 • Wuyang Chen, Xinyu Gong, Junru Wu, Yunchao Wei, Humphrey Shi, Zhicheng Yan, Yi Yang, Zhangyang Wang

This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation.

Neural Architecture Search

Paper
Code

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

1 code implementation • 24 Aug 2021 • Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

In view of those, we present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.

125

Paper
Code

SSH: A Self-Supervised Framework for Image Harmonization

1 code implementation • ICCV 2021 • Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

Image harmonization aims to improve the quality of image compositing by matching the "appearance" (\eg, color tone, brightness and contrast) between foreground and background images.

Benchmarking Data Augmentation +1

Paper
Code

Federated Adversarial Debiasing for Fair and Transferable Representations

1 code implementation • the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining 2021 • Junyuan Hong, Zhuangdi Zhu, Shuyang Yu, Zhangyang Wang, Hiroko Dodge, Jiayu Zhou

While adversarial learning is commonly used in centralized learning for mitigating bias, there are significant barriers when extending it to the federated framework.

Federated Learning Unsupervised Domain Adaptation

Paper
Code

CERL: A Unified Optimization Framework for Light Enhancement with Realistic Noise

1 code implementation • 1 Aug 2021 • Zeyuan Chen, Yifan Jiang, Dong Liu, Zhangyang Wang

We present \underline{C}oordinated \underline{E}nhancement for \underline{R}eal-world \underline{L}ow-light Noisy Images (CERL), that seamlessly integrates light enhancement and noise suppression parts into a unified and physics-grounded optimization framework.

Denoising

Paper
Code

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

1 code implementation • 23 Jul 2021 • Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin

Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depends on the access to original training data as well as the trained model parameters.

Image Generation

Paper
Code

DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference

no code implementations • 16 Jul 2021 • Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin

Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for the algorithm efficiency, especially its applications on resource-limited platforms.

Scene Understanding Segmentation +1

Paper
Add Code

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

2 code implementations • NeurIPS 2021 • Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang

Based on our analysis, we summarize a guideline for parameter settings in regards of specific architecture characteristics, which we hope to catalyze the research progress on the topic of lottery ticket hypothesis.

139

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.