Search Results for author: Li Shen

Found 208 papers, 88 papers with code

Federated Learning with Only Positive Labels by Exploring Label Correlations

no code implementations24 Apr 2024 Xuming An, Dui Wang, Li Shen, Yong Luo, Han Hu, Bo Du, Yonggang Wen, DaCheng Tao

Specifically, FedALC estimates the label correlations between different label pairs during class embedding learning and utilizes them to improve model training.

Federated Learning Multi-Label Classification

Continuous Spiking Graph Neural Networks

no code implementations2 Apr 2024 Nan Yin, Mengzhu Wang, Li Shen, Hitesh Laxmichand Patel, Baopu Li, Bin Gu, Huan Xiong

Inspired by recent spiking neural networks (SNNs), which emulate a biological inference process and provide an energy-efficient neural architecture, we incorporate the SNNs with CGNNs in a unified framework, named Continuous Spiking Graph Neural Networks (COS-GNN).
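
For readers unfamiliar with the spiking mechanism referenced above, a minimal leaky integrate-and-fire (LIF) neuron update is sketched below. This is a generic illustration, not the COS-GNN model itself; the decay, threshold, and hard-reset scheme are illustrative assumptions.

```python
import numpy as np

def lif_step(v, x, decay=0.9, threshold=1.0):
    """One leaky integrate-and-fire step: leak, integrate, spike, reset."""
    v = decay * v + x                              # leak old potential, add input
    spikes = (v >= threshold).astype(np.float32)   # emit binary spikes
    v = v * (1.0 - spikes)                         # hard reset where a spike fired
    return v, spikes

v = np.zeros(4)
for t in range(5):
    v, s = lif_step(v, np.random.rand(4))
    print(t, s)
```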

Heterogeneous Federated Learning with Splited Language Model

no code implementations24 Mar 2024 Yifan Shi, Yuhui Zhang, Ziyue Huang, Xiaofeng Yang, Li Shen, Wei Chen, Xueqian Wang

Federated Split Learning (FSL) is a promising distributed learning paradigm in practice, which gathers the strengths of both Federated Learning (FL) and Split Learning (SL) paradigms, to ensure model privacy while diminishing the resource overhead of each client, especially on large transformer models in a resource-constrained environment, e.g., Internet of Things (IoT).

Federated Learning Language Modelling

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

no code implementations21 Mar 2024 Changtong Zan, Liang Ding, Li Shen, Yibing Zhan, Weifeng Liu, DaCheng Tao

In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.

In-Context Learning Instruction Following +1

A Unified and General Framework for Continual Learning

1 code implementation20 Mar 2024 Zhenyi Wang, Yan Li, Li Shen, Heng Huang

Extensive experiments on CL benchmarks and theoretical analysis demonstrate the effectiveness of the proposed refresh learning.

Continual Learning

Communication-Efficient Distributed Learning with Local Immediate Error Compensation

no code implementations19 Feb 2024 Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, DaCheng Tao, Enhong Chen

However, existing compression methods either perform only unidirectional compression in one iteration, incurring a higher communication cost, or perform bidirectional compression at a slower convergence rate.
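
A minimal sketch of the error-compensation idea referred to above, in its generic error-feedback form (not the paper's bidirectional algorithm): the residual lost to compression is kept locally and added back before the next compression step. The top-k compressor and all constants are assumptions.

```python
import numpy as np

def topk_compress(g, k):
    """Keep the k largest-magnitude entries of g, zero out the rest."""
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g))[-k:]
    out[idx] = g[idx]
    return out

def compensated_step(grad, residual, k=2):
    """Error feedback: compress (grad + residual), remember what was dropped."""
    corrected = grad + residual
    sent = topk_compress(corrected, k)
    residual = corrected - sent        # compression error kept for the next round
    return sent, residual

residual = np.zeros(6)
for _ in range(3):
    sent, residual = compensated_step(np.random.randn(6), residual)
```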

Revisiting Knowledge Distillation for Autoregressive Language Models

no code implementations19 Feb 2024 Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, DaCheng Tao

Knowledge distillation (KD) is a common approach to compress a teacher model to reduce its inference cost and memory footprint, by training a smaller student model.
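
As a refresher on the standard KD objective this paper revisits for autoregressive LMs, here is the generic logit-matching loss; the temperature and mixing weight below are arbitrary choices, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Classic distillation: soft KL term against the teacher plus hard CE term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                          # rescale gradients for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(8, 100)                  # student logits
t = torch.randn(8, 100)                  # teacher logits
y = torch.randint(0, 100, (8,))
print(kd_loss(s, t, y).item())
```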

Knowledge Distillation

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

no code implementations12 Feb 2024 Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

To further exploit the capabilities of bootstrapping, we investigate and adjust the training order of data, which yields improved performance of the model.

In-Context Learning

Representation Surgery for Multi-Task Model Merging

1 code implementation5 Feb 2024 Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xiaojun Chen, Xingwei Wang, DaCheng Tao

That is, there is a significant discrepancy in the representation distribution between the merged and individual models, resulting in poor performance of merged MTL.

Computational Efficiency Multi-Task Learning

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

no code implementations1 Feb 2024 Anke Tang, Li Shen, Yong Luo, Nan Yin, Lefei Zhang, DaCheng Tao

A notable challenge is mitigating the interference between parameters of different models, which can substantially deteriorate performance.

Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT

no code implementations31 Jan 2024 Diego Machado Reyes, Hanqing Chao, Juergen Hahn, Li Shen, Pingkun Yan

Thus, we propose a multimodal framework that uses early-stage indicators such as imaging, genetics and clinical assessments to classify AD patients into subtypes at early stages.

Solving Continual Offline Reinforcement Learning with Decision Transformer

no code implementations16 Jan 2024 Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, DaCheng Tao

We aim to investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continuous learner to address these issues.

Offline RL reinforcement-learning +1

OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models

1 code implementation12 Jan 2024 Shuai Wang, Liang Ding, Li Shen, Yong Luo, Bo Du, DaCheng Tao

Advancing automated programming necessitates robust and comprehensive code generation benchmarks, yet current evaluation frameworks largely neglect object-oriented programming (OOP) in favor of functional programming (FP), e.g., HumanEval and MBPP.

Code Generation

Siamese Networks with Soft Labels for Unsupervised Lesion Detection and Patch Pretraining on Screening Mammograms

no code implementations10 Jan 2024 Kevin Van Vorst, Li Shen

Self-supervised learning has become a popular way to pretrain a deep learning model and then transfer it to perform downstream tasks.

Lesion Detection Self-Supervised Learning

Neural Network Approximation for Pessimistic Offline Reinforcement Learning

no code implementations19 Dec 2023 Di wu, Yuling Jiao, Li Shen, Haizhao Yang, Xiliang Lu

In this paper, we establish a non-asymptotic estimation error of pessimistic offline RL using general neural network approximation with $\mathcal{C}$-mixing data regarding the structure of networks, the dimension of datasets, and the concentrability of data coverage, under mild assumptions.

Offline RL reinforcement-learning +1

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

1 code implementation11 Dec 2023 Anke Tang, Li Shen, Yong Luo, Liang Ding, Han Hu, Bo Du, DaCheng Tao

At the upper level, we focus on learning a shared Concrete mask to identify the subspace, while at the inner level, model merging is performed to maximize the performance of the merged model.
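
A hedged sketch of what a learnable Concrete (relaxed Bernoulli) mask of the kind mentioned above can look like, assuming Gumbel-sigmoid sampling; the shapes, temperature, and gating target are illustrative, not the paper's exact formulation.

```python
import torch

def concrete_mask(logits, temperature=0.5):
    """Sample a relaxed binary mask from the binary Concrete (Gumbel-sigmoid)."""
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
    noise = torch.log(u) - torch.log(1 - u)        # logistic noise
    return torch.sigmoid((logits + noise) / temperature)

logits = torch.zeros(10, requires_grad=True)       # learnable mask parameters
mask = concrete_mask(logits)                       # values in (0, 1), differentiable
gated = mask * torch.randn(10)                     # e.g., gate entries of a subspace
print(mask)
```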

Meta-Learning

Take an Irregular Route: Enhance the Decoder of Time-Series Forecasting Transformer

1 code implementation10 Dec 2023 Li Shen, Yuning Wei, Yangzhu Wang, Hongguang Li

With the development of Internet of Things (IoT) systems, precise long-term forecasting methods are requisite for decision makers to evaluate current statuses and formulate future policies.

Time Series Time Series Forecasting

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld

1 code implementation28 Nov 2023 Yijun Yang, Tianyi Zhou, Kanxue Li, Dapeng Tao, Lusong Li, Li Shen, Xiaodong He, Jing Jiang, Yuhui Shi

While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals.

Imitation Learning

Task-Distributionally Robust Data-Free Meta-Learning

no code implementations23 Nov 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Yongxian Wei, Baoyuan Wu, Chun Yuan, DaCheng Tao

TDS leads to a biased meta-learner because of the skewed task distribution towards newly generated tasks.

Meta-Learning Model Selection

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz

no code implementations23 Oct 2023 Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li

Sign-based stochastic methods have gained attention due to their ability to achieve robust performance despite using only the sign information for parameter updates.

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

no code implementations20 Oct 2023 Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

The key algorithm in solving ZSAQ is the SAM-SGA optimization, which aims to improve the quantization accuracy and model generalization via optimizing a minimax problem.

Language Modelling Quantization

Enhancing Column Generation by Reinforcement Learning-Based Hyper-Heuristic for Vehicle Routing and Scheduling Problems

no code implementations15 Oct 2023 Kuan Xu, Li Shen, Lindong Liu

In addition, we specify RLHH to solve two typical combinatorial optimization problems: Vehicle Routing Problem with Time Windows (VRPTW) and Bus Driver Scheduling Problem (BDSP).

Combinatorial Optimization Scheduling

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts

1 code implementation15 Oct 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Although a sparse Mixture of Experts (MoE) can reduce the cost by activating a small subset of parameters (e.g., one expert) for each input, its computation escalates significantly as the number of activated experts increases, limiting its practical utility.
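
To make the cost argument concrete, here is a minimal sketch of standard top-k expert routing (a generic MoE, not the paper's merging method): only the selected experts run per input, so compute grows roughly linearly with the number of activated experts. All dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=16, n_experts=4, k=1):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):                          # x: (batch, dim)
        scores = self.router(x).softmax(dim=-1)
        topv, topi = scores.topk(self.k, dim=-1)   # pick k experts per input
        out = torch.zeros_like(x)
        for j in range(self.k):                    # run only the selected experts
            for e, expert in enumerate(self.experts):
                sel = topi[:, j] == e
                if sel.any():
                    out[sel] += topv[sel, j:j + 1] * expert(x[sel])
        return out

y = TinyMoE()(torch.randn(8, 16))
```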

Computational Efficiency

Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer

no code implementations15 Oct 2023 Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, DaCheng Tao

The Mixture of Experts (MoE) has emerged as a highly successful technique in deep learning, based on the principle of divide-and-conquer to maximize model capacity without significant additional computational cost.

Question Answering

Learn From Model Beyond Fine-Tuning: A Survey

1 code implementation12 Oct 2023 Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, DaCheng Tao

LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black box environment), and to generalize the model to downstream tasks.

Meta-Learning Model Editing

Debias the Training of Diffusion Models

no code implementations12 Oct 2023 Hu Yu, Li Shen, Jie Huang, Man Zhou, Hongsheng Li, Feng Zhao

Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss.
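
The "simple denoising score matching loss" mentioned above is commonly written as an epsilon-prediction MSE in DDPM form; a minimal sketch follows, with a placeholder model and an assumed noise schedule.

```python
import torch

def ddpm_loss(model, x0, alphas_bar):
    """Simple DDPM objective: predict the noise added at a random timestep."""
    t = torch.randint(0, len(alphas_bar), (x0.size(0),))
    a = alphas_bar[t].view(-1, 1)
    eps = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps     # forward diffusion sample
    return ((model(x_t, t) - eps) ** 2).mean()     # epsilon-prediction MSE

model = lambda x, t: x                             # stand-in noise predictor
alphas_bar = torch.linspace(0.999, 0.01, 1000)     # assumed cumulative schedule
print(ddpm_loss(model, torch.randn(4, 8), alphas_bar).item())
```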

Denoising

ChatGPT for Computational Topology

1 code implementation11 Oct 2023 Jian Liu, Li Shen, Guo-Wei Wei

This work serves as an initial step towards effectively transforming pure mathematical theories into practical computational tools, with the ultimate goal of enabling real applications across diverse fields.

Topological Data Analysis

Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

1 code implementation11 Oct 2023 Guozheng Ma, Lu Li, Sen Zhang, Zixuan Liu, Zhen Wang, Yixin Chen, Li Shen, Xueqian Wang, DaCheng Tao

Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning (VRL).

Data Augmentation reinforcement-learning

DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning

1 code implementation9 Oct 2023 Longxiang He, Li Shen, Linrui Zhang, Junbo Tan, Xueqian Wang

Constrained policy search (CPS) is a fundamental problem in offline reinforcement learning, which is generally solved by advantage weighted regression (AWR).
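
For context, AWR (the baseline the snippet contrasts against) weights a behavior-cloning objective by exponentiated advantages; a minimal sketch with an assumed temperature and clipping value:

```python
import torch

def awr_weights(q, v, beta=1.0, max_weight=20.0):
    """Weight each logged action by exp(advantage / beta), clipped for stability."""
    return torch.clamp(torch.exp((q - v) / beta), max=max_weight)

def awr_loss(log_probs, q, v):
    """Advantage-weighted behavior cloning: weighted negative log-likelihood."""
    return -(awr_weights(q, v) * log_probs).mean()

lp = torch.log(torch.rand(5))                  # log pi(a|s) for logged actions
print(awr_loss(lp, torch.randn(5), torch.randn(5)).item())
```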

D4RL Offline RL +1

Asymmetrically Decentralized Federated Learning

no code implementations8 Oct 2023 Qinglun Li, Miao Zhang, Nan Yin, Quanjun Yin, Li Shen

To further improve algorithm performance and alleviate local heterogeneous overfitting in Federated Learning (FL), our algorithm combines the Sharpness Aware Minimization (SAM) optimizer and local momentum.

Federated Learning

Parameter Efficient Multi-task Model Fusion with Partial Linearization

1 code implementation7 Oct 2023 Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, DaCheng Tao

We demonstrate that our partial linearization technique enables a more effective fusion of multiple tasks into a single model, outperforming standard adapter tuning and task arithmetic alone.

Which mode is better for federated learning? Centralized or Decentralized

no code implementations5 Oct 2023 Yan Sun, Li Shen, DaCheng Tao

Both centralized and decentralized approaches have shown excellent performance and great application value in federated learning (FL).

Federated Learning valid

Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models

no code implementations4 Oct 2023 Zihao Lin, Yan Sun, Yifan Shi, Xueqian Wang, Lifu Huang, Li Shen, DaCheng Tao

With the blowout development of pre-trained models (PTMs), the efficient tuning of these models for diverse downstream applications has emerged as a pivotal research concern.

AdaMerging: Adaptive Model Merging for Multi-Task Learning

1 code implementation4 Oct 2023 Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

This approach aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
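
A hedged sketch of the merging step this describes: task vectors (finetuned minus pretrained weights) combined with learnable coefficients. The data-free objective used to train the coefficients is omitted, and all names and shapes are illustrative.

```python
import torch

def merge_task_vectors(base, task_vectors, coeffs):
    """Merged weights = base + sum_k coeff_k * (finetuned_k - base), per tensor."""
    return {name: w + sum(c * tv[name] for c, tv in zip(coeffs, task_vectors))
            for name, w in base.items()}

base = {"w": torch.zeros(3)}                           # pretrained weights
task_vectors = [{"w": torch.ones(3)}, {"w": -2 * torch.ones(3)}]
coeffs = torch.tensor([0.3, 0.3], requires_grad=True)  # learnable merge weights
merged = merge_task_vectors(base, task_vectors, coeffs)
print(merged["w"])
```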

Multi-Task Learning

Towards Stable Backdoor Purification through Feature Shift Tuning

1 code implementation NeurIPS 2023 Rui Min, Zeyu Qin, Li Shen, Minhao Cheng

Our analysis shows that with the low poisoning rate, the entanglement between backdoor and clean features undermines the effect of tuning-based defenses.

Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation

1 code implementation28 Sep 2023 Changtong Zan, Liang Ding, Li Shen, Yibin Lei, Yibing Zhan, Weifeng Liu, DaCheng Tao

Zero-shot translation (ZST), which is generally based on a multilingual neural machine translation model, aims to translate between language pairs unseen in the training data.

Machine Translation Navigate +2

Deep Model Fusion: A Survey

no code implementations27 Sep 2023 Weishi Li, Yong Peng, Miao Zhang, Liang Ding, Han Hu, Li Shen

Specifically, we categorize existing deep model fusion methods into four categories: (1) "Mode connectivity", which connects the solutions in weight space via a path of non-increasing loss, in order to obtain better initialization for model fusion; (2) "Alignment", which matches units between neural networks to create better conditions for fusion; (3) "Weight average", a classical model fusion method, which averages the weights of multiple models to obtain more accurate results closer to the optimal solution; (4) "Ensemble learning", which combines the outputs of diverse models, a foundational technique for improving the accuracy and robustness of the final model.
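
Of these four categories, "weight average" is the simplest to state; below is a minimal sketch of uniform state-dict averaging, assuming all models share one architecture (names are illustrative).

```python
import torch

def average_weights(state_dicts):
    """Uniformly average matching parameter tensors across models."""
    return {name: torch.stack([sd[name].float() for sd in state_dicts]).mean(0)
            for name in state_dicts[0]}

models = [{"w": torch.randn(2, 2)} for _ in range(3)]
print(average_weights(models)["w"])
```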

Ensemble Learning

FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data

no code implementations18 Sep 2023 Hao Sun, Li Shen, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Federated learning is an emerging distributed machine learning method that enables a large number of clients to train a model without exchanging their local data.

Federated Learning Scheduling

Continual Learning From a Stream of APIs

no code implementations31 Aug 2023 Enneng Yang, Zhenyi Wang, Li Shen, Nan Yin, Tongliang Liu, Guibing Guo, Xingwei Wang, DaCheng Tao

Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model.

Continual Learning

MerA: Merging Pretrained Adapters For Few-Shot Learning

no code implementations30 Aug 2023 Shwai He, Run-Ze Fan, Liang Ding, Li Shen, Tianyi Zhou, DaCheng Tao

Adapter tuning, which updates only a few parameters, has become a mainstream method for fine-tuning pretrained language models to downstream tasks.

Few-Shot Learning MRPC

Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?

1 code implementation24 Aug 2023 Fei Wang, Liang Ding, Jun Rao, Ye Liu, Li Shen, Changxing Ding

The multimedia community has shown a significant interest in perceiving and representing the physical world with multimodal pretrained neural network models, and among them, vision-language pretraining (VLP) is currently the most captivating topic.

Attribute Negation +1

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

1 code implementation24 Aug 2023 Hanchi Huang, Li Shen, Deheng Ye, Wei Liu

We propose a novel master-slave architecture to solve the top-$K$ combinatorial multi-armed bandits problem with non-linear bandit feedback and diversity constraints, which, to the best of our knowledge, is the first combinatorial bandits setting considering diversity constraints under bandit feedback.

Multi-Armed Bandits

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

no code implementations18 Aug 2023 Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, DaCheng Tao

Stochastic gradient descent (SGD) performed in an asynchronous manner plays a crucial role in training large-scale machine learning models.

DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning

no code implementations16 Aug 2023 Qinglun Li, Li Shen, Guanghao Li, Quanjun Yin, DaCheng Tao

To address the communication burden issues associated with federated learning (FL), decentralized federated learning (DFL) discards the central server and establishes a decentralized communication network, where each client communicates only with neighboring clients.

Federated Learning

LGViT: Dynamic Early Exiting for Accelerating Vision Transformer

1 code implementation1 Aug 2023 Guanyu Xu, Jiawei Hao, Li Shen, Han Hu, Yong Luo, Hui Lin, Jialie Shen

Recently, the efficient deployment and acceleration of powerful vision transformers (ViTs) on resource-limited edge devices for providing multimedia services have become attractive tasks.

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

no code implementations30 Jul 2023 Yan Sun, Li Shen, Hao Sun, Liang Ding, DaCheng Tao

Adaptive optimization has achieved notable success for distributed learning, while extending adaptive optimizers to federated learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in the global adaptive optimizer; (ii) client drifts exacerbated by local over-fitting with the local adaptive optimizer.

Federated Learning

High-Resolution Volumetric Reconstruction for Clothed Humans

no code implementations25 Jul 2023 Sicong Tang, Guangyuan Wang, Qing Ran, Lingzhi Li, Li Shen, Ping Tan

We present a novel method for reconstructing clothed humans from a sparse set of, e.g., 1 to 6 RGB images.

Quantization

GBT: Two-stage transformer framework for non-stationary time series forecasting

1 code implementation17 Jul 2023 Li Shen, Yuning Wei, Yangzhu Wang

It decouples the prediction process of TSFT into two stages, an Auto-Regression stage and a Self-Regression stage, to tackle the problem of different statistical properties between input and prediction sequences. Prediction results of the Auto-Regression stage serve as a Good Beginning, i.e., a better initialization for the inputs of the Self-Regression stage.

regression Time Series +1

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning

1 code implementation16 Jul 2023 Zhenyi Wang, Enneng Yang, Li Shen, Heng Huang

Through this comprehensive survey, we aspire to uncover potential solutions by drawing upon ideas and approaches from various fields that have dealt with forgetting.

Continual Learning Federated Learning +1

Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy

no code implementations14 Jul 2023 Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu

To further integrate it with the normal training process, we then propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through a min-max optimization. Specifically, the outer loop aims to achieve the backdoor attack goal by minimizing the loss based on the selected samples, while the inner loop selects hard poisoning samples that impede this goal by maximizing the loss.

Backdoor Attack Data Poisoning

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

1 code implementation30 Jun 2023 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, DaCheng Tao

Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight.
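
A minimal sketch of the vanilla SAM step described here (ascend along the gradient to a nearby perturbed point, then update with the gradient computed there); the rho value and closure-style interface are illustrative assumptions, not the paper's sparsified variant.

```python
import torch

def sam_step(params, loss_fn, base_opt, rho=0.05):
    """One SAM update: perturb weights toward the worst case, descend from there."""
    base_opt.zero_grad()
    loss_fn().backward()                      # gradient at the current weights
    grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params)) + 1e-12
    eps = []
    with torch.no_grad():
        for p in params:                      # climb to the approx. worst point
            e = rho * p.grad / grad_norm
            p.add_(e)
            eps.append(e)
    base_opt.zero_grad()
    loss_fn().backward()                      # gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                         # move back before the real update
    base_opt.step()

w = torch.nn.Parameter(torch.randn(3))
opt = torch.optim.SGD([w], lr=0.1)
sam_step([w], lambda: (w ** 2).sum(), opt)
```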

Enhancing Adversarial Training via Reweighting Optimization Trajectory

1 code implementation25 Jun 2023 Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlado Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Despite the fact that adversarial training has become the de facto method for improving the robustness of deep neural networks, it is well-known that vanilla adversarial training suffers from daunting robust overfitting, resulting in unsatisfactory robust generalization.

Adversarial Robustness

FDNet: Focal Decomposed Network for Efficient, Robust and Practical Time Series Forecasting

1 code implementation19 Jun 2023 Li Shen, Yuning Wei, Yangzhu Wang, Huaxin Qiu

Moreover, we propose a focal input sequence decomposition method, which decomposes the input sequence in a focal manner for efficient and robust forecasting when facing the Long Sequence Time-series Input (LSTI) problem.

Inductive Bias Time Series +1

One-step Multi-view Clustering with Diverse Representation

no code implementations8 Jun 2023 Xinhang Wan, Jiyuan Liu, Xinwang Liu, Siwei Wang, Yi Wen, Tianjiao Wan, Li Shen, En Zhu

In light of this, we propose a one-step multi-view clustering with diverse representation method, which incorporates multi-view learning and $k$-means into a unified framework.

Clustering MULTI-VIEW LEARNING +1

CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification

no code implementations8 Jun 2023 Nan Yin, Li Shen, Mengzhu Wang, Long Lan, Zeyu Ma, Chong Chen, Xian-Sheng Hua, Xiao Luo

Although graph neural networks (GNNs) have achieved impressive achievements in graph classification, they often need abundant task-specific labels, which could be extensively costly to acquire.

Contrastive Learning Domain Adaptation +2

Are Large Kernels Better Teachers than Transformers for ConvNets?

1 code implementation30 May 2023 Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu

We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.

Knowledge Distillation

Compact Real-time Radiance Fields with Neural Codebook

no code implementations29 May 2023 Lingzhi Li, Zhongshu Wang, Zhen Shen, Li Shen, Ping Tan

Reconstructing neural radiance fields with explicit volumetric representations, demonstrated by Plenoxels, has shown remarkable advantages on training and rendering efficiency, while grid-based representations typically induce considerable overhead for storage and transmission.

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation28 May 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Few-Shot Learning Knowledge Distillation

Incomplete Multimodal Learning for Complex Brain Disorders Prediction

no code implementations25 May 2023 Reza Shirkavand, Liang Zhan, Heng Huang, Li Shen, Paul M. Thompson

Especially in studies of brain diseases, research cohorts may include both neuroimaging data and genetic data, but for practical clinical diagnosis, we often need to make disease predictions only based on neuroimages.

Data Integration

Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training

no code implementations24 May 2023 Yifan Shi, Yingqi Liu, Yan Sun, Zihao Lin, Li Shen, Xueqian Wang, DaCheng Tao

Personalized federated learning (PFL) aims to produce the best personalized model for each client in the face of an intractable problem in real FL systems: data heterogeneity.

Personalized Federated Learning

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

1 code implementation19 May 2023 Yan Sun, Li Shen, Shixiang Chen, Liang Ding, DaCheng Tao

In federated learning (FL), a cluster of local clients is coordinated by the global server, cooperatively training one model with privacy protection.

Federated Learning

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations16 May 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

1 code implementation10 May 2023 Fa-Ting Hong, Li Shen, Dan Xu

In this work, firstly, we present a novel self-supervised method for learning dense 3D facial geometry (i.e., depth) from face videos, without requiring camera parameters and 3D geometry annotations in training.

Generative Adversarial Network Keypoint Estimation +2

Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

1 code implementation1 May 2023 Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, DaCheng Tao

To defend against inference attacks and mitigate sensitive information leakage in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de facto standard for privacy protection: it clips local updates and adds random noise.
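
The clip-then-noise mechanism mentioned above, in its generic per-client form; a minimal sketch where the clipping norm and noise scale are arbitrary placeholders rather than calibrated privacy parameters.

```python
import torch

def privatize_update(update, clip_norm=1.0, noise_std=0.1):
    """Client-level DP sketch: clip the update's L2 norm, then add Gaussian noise."""
    scale = torch.clamp(clip_norm / (update.norm() + 1e-12), max=1.0)
    return update * scale + noise_std * torch.randn_like(update)

delta = torch.randn(100)                       # a client's model update
print(privatize_update(delta).norm())
```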

Federated Learning

Evaluate Geometry of Radiance Fields with Low-frequency Color Prior

1 code implementation10 Apr 2023 Qihang Fang, Yafei Song, Keqiang Li, Li Shen, Huaiyu Wu, Gang Xiong, Liefeng Bo

From this insight, given a reconstructed density field and observation images, we design a closed-form method to approximate the color field with low-frequency spherical harmonics, and compute the inverse mean residual color.

3D Reconstruction Novel View Synthesis

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

no code implementations7 Apr 2023 Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, DaCheng Tao

The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.

Quantum Imitation Learning

no code implementations4 Apr 2023 Zhihao Cheng, Kaining Zhang, Li Shen, DaCheng Tao

Despite remarkable successes in solving various complex decision-making tasks, training an imitation learning (IL) algorithm with deep neural networks (DNNs) suffers from the high computation burden.

Behavioural cloning

Towards Making the Most of ChatGPT for Machine Translation

1 code implementation24 Mar 2023 Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually can achieve better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations for non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still need to be highlighted for the MT/NLP community.

In-Context Learning Machine Translation +2

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

2 code implementations CVPR 2023 Zhuo Huang, Miaoxi Zhu, Xiaobo Xia, Li Shen, Jun Yu, Chen Gong, Bo Han, Bo Du, Tongliang Liu

Experimentally, we simulate photon-limited corruptions using CIFAR10/100 and ImageNet30 datasets and show that SharpDRO exhibits a strong generalization ability against severe corruptions and exceeds well-known baseline methods with large performance gains.

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

1 code implementation CVPR 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, DaCheng Tao

The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data.

Meta-Learning

Make Landscape Flatter in Differentially Private Federated Learning

1 code implementation CVPR 2023 Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, DaCheng Tao

Specifically, DP-FedSAM integrates Sharpness Aware Minimization (SAM) optimizer to generate local flatness models with better stability and weight perturbation robustness, which results in the small norm of local updates and robustness to DP noise, thereby improving the performance.

Federated Learning

Visual Prompt Based Personalized Federated Learning

no code implementations15 Mar 2023 Guanghao Li, Wansen Wu, Yan Sun, Li Shen, Baoyuan Wu, DaCheng Tao

Then, the local model is trained on the input composed of raw data and a visual prompt to learn the distribution information contained in the prompt.

Image Classification Personalized Federated Learning

Graph Decision Transformer

no code implementations7 Mar 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.

Offline RL OpenAI Gym +1

SGDA: Towards 3D Universal Pulmonary Nodule Detection via Slice Grouped Domain Attention

1 code implementation7 Mar 2023 Rui Xu, Zhi Liu, Yong Luo, Han Hu, Li Shen, Bo Du, Kaiming Kuang, Jiancheng Yang

To address this issue, we propose a slice grouped domain attention (SGDA) module to enhance the generalization capability of the pulmonary nodule detection networks.

Computed Tomography (CT)

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

no code implementations1 Mar 2023 Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks without theoretical guarantee due to the triple difficulties in analyzing the coupled perturbation step, adaptive learning rate and momentum step.

Subspace based Federated Unlearning

no code implementations24 Feb 2023 Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, DaCheng Tao

Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data.

Federated Learning

Fusion of Global and Local Knowledge for Personalized Federated Learning

1 code implementation21 Feb 2023 Tiansheng Huang, Li Shen, Yan Sun, Weiwei Lin, DaCheng Tao

Personalized federated learning, as a variant of federated learning, trains customized models for clients using their heterogeneously distributed data.

Personalized Federated Learning

Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling

no code implementations21 Feb 2023 Selena Wang, Yiting Wang, Frederick H. Xu, Li Shen, Yize Zhao

By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls incorporating the anatomical attributes (volume, thickness and area) on nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

1 code implementation21 Feb 2023 Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, DaCheng Tao

Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections.

Federated Learning

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

no code implementations18 Feb 2023 Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, DaCheng Tao

This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.

Contrastive Learning Denoising +12

FedABC: Targeting Fair Competition in Personalized Federated Learning

no code implementations15 Feb 2023 Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, DaCheng Tao

In particular, we adopt the ``one-vs-all'' training strategy in each client to alleviate the unfair competition between classes by constructing a personalized binary classification problem for each class.

Binary Classification Personalized Federated Learning

Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

no code implementations11 Feb 2023 Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, DaCheng Tao

Federated learning (FL), as a collaborative distributed training paradigm over edge computing devices under the coordination of a centralized server, is plagued by inconsistent local stationary points due to the heterogeneity of partially participating clients, which precipitates local client drift and leads to unstable and slow convergence, especially on highly heterogeneous datasets.

Edge-computing Federated Learning

Improving the Model Consistency of Decentralized Federated Learning

no code implementations8 Feb 2023 Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, DaCheng Tao

To mitigate the privacy leakages and communication burdens of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client only communicates with its neighbors in a decentralized communication network.

Federated Learning

MetaMix: Towards Corruption-Robust Continual Learning With Temporally Self-Adaptive Data Transformation

no code implementations CVPR 2023 Zhenyi Wang, Li Shen, Donglin Zhan, Qiuling Suo, Yanjun Zhu, Tiehang Duan, Mingchen Gao

To make them trustworthy and robust to corruptions deployed in safety-critical scenarios, we propose a meta-learning framework of self-adaptive data augmentation to tackle the corruption robustness in CL.

Continual Learning Data Augmentation +1

Global Balanced Experts for Federated Long-Tailed Learning

1 code implementation ICCV 2023 Yaopei Zeng, Lei Liu, Li Liu, Li Shen, Shaoguo Liu, Baoyuan Wu

In particular, a proxy is derived from the accumulated gradients uploaded by the clients after local training, and is shared by all clients as the class prior for re-balance training.

Federated Learning Privacy Preserving

Data Augmented Flatness-aware Gradient Projection for Continual Learning

no code implementations ICCV 2023 Enneng Yang, Li Shen, Zhenyi Wang, Shiwei Liu, Guibing Guo, Xingwei Wang

In this paper, we first revisit the gradient projection method from the perspective of flatness of loss surface, and find that unflatness of the loss surface leads to catastrophic forgetting of the old tasks when the projection constraint is reduced to improve the performance of new tasks.

Continual Learning

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

no code implementations29 Dec 2022 Shengchao Hu, Li Shen, Ya zhang, Yixin Chen, DaCheng Tao

Transformer, originally devised for natural language processing, has also attained significant success in computer vision.

Autonomous Driving reinforcement-learning +2

SVSBI: Sequence-based virtual screening of biomolecular interactions

1 code implementation27 Dec 2022 Li Shen, Hongsong Feng, Yuchi Qiu, Guo-Wei Wei

Virtual screening (VS) is an essential technique for understanding biomolecular interactions, particularly in drug design and discovery.

Drug Discovery Molecular Docking

Rethinking the Role of Pre-Trained Networks in Source-Free Domain Adaptation

no code implementations ICCV 2023 Wenyu Zhang, Li Shen, Chuan-Sheng Foo

We propose to distil useful target domain information through a co-learning strategy to improve target pseudolabel quality for finetuning the source model.

Representation Learning Source-Free Domain Adaptation +1

Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks

no code implementations12 Dec 2022 Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, DaCheng Tao

Despite a large number of reinforcement learning (RL) methods focusing on safety-critical tasks, there is still a lack of high-quality evaluation of those algorithms that adheres to safety constraints at each decision step under complex and unknown dynamics.

Autonomous Driving reinforcement-learning +2

4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions

1 code implementation9 Dec 2022 Zhongshu Wang, Lingzhi Li, Zhen Shen, Li Shen, Liefeng Bo

In this paper, we present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions, building on the methodology of neural radiance fields (NeRF).

4k Vocal Bursts Intensity Prediction

Compressing Volumetric Radiance Fields to 1 MB

1 code implementation CVPR 2023 Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Liefeng Bo

Approximating radiance fields with volumetric grids is one of the promising directions for improving NeRF, represented by methods like Plenoxels and DVGO, which achieve super-fast training convergence and real-time rendering.

Model Compression Neural Rendering +1

AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

no code implementations28 Nov 2022 Enneng Yang, Junwei Pan, Ximei Wang, Haibin Yu, Li Shen, Xihua Chen, Lei Xiao, Jie Jiang, Guibing Guo

In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter.
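
A toy sketch of the proposed measurement, under the assumption that "total updates" means accumulated update magnitudes per task and parameter; the names and the SGD-style update size are illustrative.

```python
import torch

def accumulate_task_updates(totals, task_id, grads, lr=0.01):
    """Accumulate the magnitude of each task's updates on every parameter tensor."""
    for name, g in grads.items():
        totals.setdefault(name, {}).setdefault(task_id, 0.0)
        totals[name][task_id] += (lr * g).abs().sum().item()
    return totals

totals = {}
for task in range(2):                          # two tasks contribute gradients
    grads = {"layer.weight": torch.randn(4, 4)}
    totals = accumulate_task_updates(totals, task, grads)
# a parameter is "dominated" by the task with the largest accumulated update
print(totals)
```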

Multi-Task Learning Recommendation Systems

Curriculum-based Asymmetric Multi-task Reinforcement Learning

1 code implementation7 Nov 2022 Hanchi Huang, Deheng Ye, Li Shen, Wei Liu

To mitigate the negative influence of customizing the one-off training order in curriculum-based AMTL, CAMRL switches its training mode between parallel single-task RL and asymmetric multi-task RL (MTRL), according to an indicator regarding the training time, the overall performance, and the performance gap among tasks.

Multi-Task Learning reinforcement-learning +1

Streaming Radiance Fields for 3D Video Synthesis

1 code implementation26 Oct 2022 Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Ping Tan

Instead of training a single model that combines all the frames, we formulate the dynamic modeling problem with an incremental learning paradigm in which per-frame model difference is trained to complement the adaption of a base model on the current frame.

Incremental Learning Model Optimization +1

Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

3 code implementations12 Oct 2022 Zeyu Qin, Yanbo Fan, Yi Liu, Li Shen, Yong Zhang, Jue Wang, Baoyuan Wu

Furthermore, RAP can be naturally combined with many existing black-box attack techniques, to further boost the transferability.

Adversarial Attack

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

1 code implementation11 Oct 2022 Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, DaCheng Tao

One of the popular solutions is Sharpness-Aware Minimization (SAM), which smooths the loss landscape via minimizing the maximized change of training loss when adding a perturbation to the weight.

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

1 code implementation11 Oct 2022 Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, DaCheng Tao

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.

Strength-Adaptive Adversarial Training

no code implementations4 Oct 2022 Chaojian Yu, Dawei Zhou, Li Shen, Jun Yu, Bo Han, Mingming Gong, Nannan Wang, Tongliang Liu

Firstly, applying a pre-specified perturbation budget on networks of various model capacities will yield divergent degrees of robustness disparity between natural and robust accuracies, which deviates from the robust network's desideratum.

Adversarial Robustness Scheduling

Tensor-Based Multi-Modality Feature Selection and Regression for Alzheimer's Disease Diagnosis

1 code implementation23 Sep 2022 Jun Yu, Zhaoming Kong, Liang Zhan, Li Shen, Lifang He

The assessment of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI) associated with brain changes remains a challenging task.

feature selection regression

Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions

1 code implementation3 Sep 2022 Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao

Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data; and (ii) how to prevent catastrophic forgetting on previously learned task distributions due to the task distribution shift.

Meta-Learning

Respecting Time Series Properties Makes Deep Time Series Forecasting Perfect

1 code implementation22 Jul 2022 Li Shen, Yuning Wei, Yangzhu Wang

Thanks to the core idea of respecting time series properties, regardless of the forecasting format, RTNet shows clearly superior forecasting performance compared with dozens of other SOTA time series forecasting baselines on three real-world benchmark datasets.

Time Series Time Series Forecasting

Improving Task-free Continual Learning by Distributionally Robust Memory Evolution

1 code implementation15 Jul 2022 Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Tiehang Duan, Mingchen Gao

To address these problems, for the first time, we propose a principled memory evolution framework to dynamically evolve the memory data distribution by making the memory buffer gradually harder to be memorized with distributionally robust optimization (DRO).

Continual Learning

Harnessing Out-Of-Distribution Examples via Augmenting Content and Style

1 code implementation7 Jul 2022 Zhuo Huang, Xiaobo Xia, Li Shen, Bo Han, Mingming Gong, Chen Gong, Tongliang Liu

Machine learning models are vulnerable to Out-Of-Distribution (OOD) examples, and such a problem has drawn much attention.

Data Augmentation Disentanglement +3

Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

1 code implementation6 Jul 2022 Davoud Ataee Tarzanagh, Parvin Nazari, BoJian Hou, Li Shen, Laura Balzano

This paper introduces \textit{online bilevel optimization} in which a sequence of time-varying bilevel problems is revealed one after the other.

Bilevel Optimization

Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph

1 code implementation5 Jul 2022 Liang Li, Siwei Wang, Xinwang Liu, En Zhu, Li Shen, Kenli Li, Keqin Li

Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.

Clustering

Dynamic Contrastive Distillation for Image-Text Retrieval

no code implementations4 Jul 2022 Jun Rao, Liang Ding, Shuhan Qi, Meng Fang, Yang Liu, Li Shen, DaCheng Tao

Although the vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts its deployment to real-world search scenarios (where the high latency is unacceptable).

Contrastive Learning Metric Learning +3

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

no code implementations27 Jun 2022 Chuang Zhang, Li Shen, Jian Yang, Chen Gong

To exploit this effect, the model prediction-based methods have been widely adopted, which aim to exploit the outputs of DNNs in the early stage of learning to correct noisy labels.

Learning with noisy labels Memorization

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

1 code implementation17 Jun 2022 Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang

Safe reinforcement learning (RL) has achieved significant success on risk-sensitive tasks and shown promise in autonomous driving (AD) as well.

Autonomous Driving reinforcement-learning +2

Understanding Robust Overfitting of Adversarial Training and Beyond

1 code implementation17 Jun 2022 Chaojian Yu, Bo Han, Li Shen, Jun Yu, Chen Gong, Mingming Gong, Tongliang Liu

Here, we explore the causes of robust overfitting by comparing the data distributions of non-overfit (weak adversary) and overfitted (strong adversary) adversarial training, and observe that the distribution of the adversarial data generated by the weak adversary mainly contains small-loss data.

Adversarial Robustness Data Ablation

DisPFL: Towards Communication-Efficient Personalized Federated Learning via Decentralized Sparse Training

1 code implementation1 Jun 2022 Rong Dai, Li Shen, Fengxiang He, Xinmei Tian, DaCheng Tao

In this work, we propose a novel personalized federated learning framework in a decentralized (peer-to-peer) communication protocol named Dis-PFL, which employs personalized sparse masks to customize sparse local models on the edge.

Personalized Federated Learning

Robust Weight Perturbation for Adversarial Training

1 code implementation30 May 2022 Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du, Tongliang Liu

Based on these observations, we propose a robust perturbation strategy to constrain the extent of weight perturbation.

Classification

Few-Shot Adaptation of Pre-Trained Networks for Domain Shift

1 code implementation30 May 2022 Wenyu Zhang, Li Shen, Wanyue Zhang, Chuan-Sheng Foo

Recent test-time adaptation methods update batch normalization layers of pre-trained source models deployed in new target environments with streaming data to mitigate such performance degradation.

domain classification Semantic Segmentation +1

Efficient-Adam: Communication-Efficient Distributed Adam

no code implementations28 May 2022 Congliang Chen, Li Shen, Wei Liu, Zhi-Quan Luo

Distributed adaptive stochastic gradient methods have been widely used for large-scale nonconvex optimization, such as training deep learning models.

Quantization

MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

1 code implementation27 May 2022 Erdun Gao, Ignavier Ng, Mingming Gong, Li Shen, Wei Huang, Tongliang Liu, Kun Zhang, Howard Bondell

In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.

Causal Discovery Imputation +1

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

no code implementations24 May 2022 Linrui Zhang, Li Shen, Long Yang, Shixiang Chen, Bo Yuan, Xueqian Wang, DaCheng Tao

Safe reinforcement learning aims to learn the optimal policy while satisfying safety constraints, which is essential in real-world applications.

reinforcement-learning Reinforcement Learning (RL) +1

Interpretable Graph Convolutional Network of Multi-Modality Brain Imaging for Alzheimer's Disease Diagnosis

no code implementations27 Apr 2022 Houliang Zhou, Lifang He, Yu Zhang, Li Shen, Brian Chen

Identification of brain regions related to specific neurological disorders is of great importance for biomarker and diagnostic studies.

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding

1 code implementation16 Apr 2022 Changtong Zan, Liang Ding, Li Shen, Yu Cao, Weifeng Liu, DaCheng Tao

For multilingual sequence-to-sequence pretrained language models (multilingual Seq2Seq PLMs), e.g., mBART, the self-supervised pretraining task is trained on a wide range of monolingual languages, e.g., 25 languages from CommonCrawl, while the downstream cross-lingual tasks generally progress on a bilingual language subset, e.g., English-German, so there exist a data discrepancy (namely, domain discrepancy) and a cross-lingual learning objective discrepancy (namely, task discrepancy) between the pretraining and finetuning stages.

Cross-Lingual Natural Language Inference nlg evaluation +4

Robust Unlearnable Examples: Protecting Data Against Adversarial Learning

2 code implementations28 Mar 2022 Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, DaCheng Tao

To address this concern, methods are proposed to make data unlearnable for deep learning models by adding a type of error-minimizing noise.

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

1 code implementation CVPR 2022 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Ling-Yu Duan

Instead, we propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG), which relieves the issue of direct model aggregation.

Data-free Knowledge Distillation Federated Learning

Depth-Aware Generative Adversarial Network for Talking Head Video Generation

1 code implementation CVPR 2022 Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu

In a denser way, the depth is also utilized to learn 3D-aware cross-modal (i.e., appearance and depth) attention to guide the generation of motion fields for warping source image representations.

Generative Adversarial Network Talking Head Generation +1

Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance

no code implementations5 Mar 2022 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.

Model Compression

Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation

no code implementations28 Feb 2022 Jing Dong, Li Shen, Yinggan Xu, Baoxiang Wang

We study the convergence of the actor-critic algorithm with nonlinear function approximation under a nonconvex-nonconcave primal-dual formulation.

Continuous Control OpenAI Gym +1

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

1 code implementation ICLR 2022 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks.

Adversarial Robustness Out-of-Distribution Detection

Achieving Personalized Federated Learning with Sparse Local Models

no code implementations27 Jan 2022 Tiansheng Huang, Shiwei Liu, Li Shen, Fengxiang He, Weiwei Lin, DaCheng Tao

To counter this issue, personalized FL (PFL) was proposed to produce dedicated local models for each individual user.

Personalized Federated Learning

Learning To Learn and Remember Super Long Multi-Domain Task Sequence

1 code implementation CVPR 2022 Zhenyi Wang, Li Shen, Tiehang Duan, Donglin Zhan, Le Fang, Mingchen Gao

We propose a domain shift detection technique to capture latent domain change and equip the meta optimizer with it to work in this setting.

Meta-Learning

DGL-GAN: Discriminator Guided Learning for GAN Compression

no code implementations13 Dec 2021 Yuesong Tian, Li Shen, Xiang Tian, DaCheng Tao, Zhifeng Li, Wei Liu, Yaowu Chen

Moreover, DGL-GAN is also effective in boosting the performance of original uncompressed GANs.

Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer

no code implementations12 Dec 2021 Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, DaCheng Tao

Bayesian neural networks (BNNs) have become a principal approach to alleviate overconfident predictions in deep learning, but they often suffer from scaling issues due to a large number of distribution parameters.

Adversarial Robustness Uncertainty Quantification +1

FedDAG: Federated DAG Structure Learning

1 code implementation7 Dec 2021 Erdun Gao, Junjia Chen, Li Shen, Tongliang Liu, Mingming Gong, Howard Bondell

To date, most directed acyclic graphs (DAGs) structure learning approaches require data to be stored in a central server.

Causal Discovery

FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks

1 code implementation22 Nov 2021 Chaoyang He, Alay Dilipbhai Shah, Zhenheng Tang, Di Fan, Adarshan Naiynar Sivashunmugam, Keerti Bhogaraju, Mita Shimpi, Li Shen, Xiaowen Chu, Mahdi Soltanolkotabi, Salman Avestimehr

To bridge the gap and facilitate the development of FL for computer vision tasks, in this work, we propose a federated learning library and benchmarking framework, named FedCV, to evaluate FL on the three most representative computer vision tasks: image classification, image segmentation, and object detection.

Benchmarking Federated Learning +5

Off-policy Imitation Learning from Visual Inputs

no code implementations8 Nov 2021 Zhihao Cheng, Li Shen, DaCheng Tao

We propose OPIfVI (Off-Policy Imitation from Visual Inputs), which is composed of an off-policy learning manner, data augmentation, and encoder techniques, to tackle the mentioned challenges, respectively.

Data Augmentation Imitation Learning

On Heterogeneously Distributed Data, Sparsity Matters

no code implementations29 Sep 2021 Tiansheng Huang, Shiwei Liu, Li Shen, Fengxiang He, Weiwei Lin, DaCheng Tao

Federated learning (FL) is particularly vulnerable to heterogeneously distributed data, since a common global model in FL may not adapt to the heterogeneous data distribution of each user.

Personalized Federated Learning

FLBoost: On-the-Fly Fine-tuning Boosts Federated Learning via Data-free Distillation

no code implementations29 Sep 2021 Lin Zhang, Li Shen, Liang Ding, DaCheng Tao, Lingyu Duan

On the contrary, we propose a new solution: on-the-fly fine-tuning of the global model in the server via data-free distillation to boost its performance, dubbed FLBoost, to relieve the issue of direct model aggregation.

Federated Learning

Sparse Unbalanced GAN Training with In-Time Over-Parameterization

no code implementations29 Sep 2021 Shiwei Liu, Yuesong Tian, Tianlong Chen, Li Shen

Perhaps most importantly, we find instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.

Model Compression

On Learning to Solve Cardinality Constrained Combinatorial Optimization in One-Shot: A Re-parameterization Approach via Gumbel-Sinkhorn-TopK

no code implementations29 Sep 2021 Runzhong Wang, Li Shen, Yiting Chen, Junchi Yan, Xiaokang Yang, DaCheng Tao

Cardinality constrained combinatorial optimization requires selecting an optimal subset of $k$ elements, and it is appealing to design data-driven algorithms that perform TopK selection over a probability distribution predicted by a neural network.

Combinatorial Optimization One-Shot Learning +1

Lagrangian Generative Adversarial Imitation Learning with Safety

no code implementations29 Sep 2021 Zhihao Cheng, Li Shen, Meng Fang, Liu Liu, DaCheng Tao

Imitation Learning (IL) merely concentrates on reproducing expert behaviors and could take dangerous actions, which is unbearable in safety-critical scenarios.

Imitation Learning

Source-Free Few-Shot Domain Adaptation

no code implementations29 Sep 2021 Wenyu Zhang, Li Shen, Chuan-Sheng Foo, Wanyue Zhang

Test-time adaptation of pre-trained source models with streaming unlabelled target data is an attractive setting that protects the privacy of source data, but it imposes mini-batch size and class-distribution requirements on the streaming data that might not be desirable in practice.

domain classification Test-time Adaptation

Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning

no code implementations ICLR 2022 Shaopeng Fu, Fengxiang He, Yang Liu, Li Shen, DaCheng Tao

To address this concern, methods are proposed to make data unlearnable for deep learning models by adding a type of error-minimizing noise.

Enabling variable high spatial resolution retrieval from a long pulse BOTDA sensor

no code implementations9 Sep 2021 Zhao Ge, Li Shen, Can Zhao, Hao Wu, Zhiyong Zhao, Ming Tang

We propose a convolutional neural network (CNN) to process the data of conventional Brillouin optical time domain analysis (BOTDA) sensors, which achieves an unprecedented performance improvement, allowing higher spatial resolution (SR) to be retrieved directly from a sensing system that uses long pump pulses.

Retrieval

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

no code implementations30 Aug 2021 Shenao Zhang, Lei Han, Li Shen

In multi-agent reinforcement learning, the behaviors that agents learn in a single Markov Game (MG) are typically confined to the given agent number.

Multi-agent Reinforcement Learning reinforcement-learning +1

TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting

2 code implementations29 Aug 2021 Li Shen, Yangzhu Wang

To address this issue, we propose the concept of tightly-coupled convolutional Transformer (TCCT) and three TCCT architectures which apply transformed CNN architectures into Transformer: (1) CSPAttention: through fusing CSPNet with the self-attention mechanism, the computation cost of self-attention is reduced by 30% and the memory usage is reduced by 50% while achieving equivalent or better prediction accuracy.

Time Series Time Series Forecasting

End-to-End Adaptive Monte Carlo Denoising and Super-Resolution

no code implementations16 Aug 2021 Xinyue Wei, HaoZhi Huang, Yujin Shi, Hongliang Yuan, Li Shen, Jue Wang

We show in this work that Monte Carlo path tracing can be further accelerated by joint super-resolution and denoising (SRD) in post-processing.

Denoising Super-Resolution

UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing

no code implementations12 Aug 2021 Meng Cao, HaoZhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Face Reenactment +3

S2Looking: A Satellite Side-Looking Dataset for Building Change Detection

1 code implementation20 Jul 2021 Li Shen, Yao Lu, Hao Chen, Hao Wei, Donghai Xie, Jiabao Yue, Rui Chen, Shouye Lv, Bitao Jiang

This paper therefore introduces S2Looking, a building-change-detection dataset that contains large-scale side-looking satellite images captured at various off-nadir angles.

Change Detection Management

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

2 code implementations NeurIPS 2021 Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has drawn considerable attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).

Network Pruning Sparse Learning
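
As a rough illustration of the prune-and-regrow idea behind such sparse-training schemes (a sketch under simplified assumptions, not the paper's exact algorithm): the weakest active weights are dropped, and the same number of connections are regrown where the dense gradient is largest.

```python
import torch

def prune_and_regrow(weights, mask, grads, frac=0.1):
    """One prune-and-regrow step (illustrative sketch): drop the
    smallest-magnitude active weights, then regrow the same number of
    connections at inactive positions with the largest gradients."""
    w, m, g = weights.flatten(), mask.flatten().clone(), grads.flatten()
    active = m.nonzero(as_tuple=True)[0]
    n = int(frac * active.numel())
    if n > 0:
        # Prune: deactivate the n weakest currently-active connections.
        weakest = active[w[active].abs().argsort()[:n]]
        m[weakest] = 0
        # Regrow: activate the n inactive positions with largest gradients.
        dead = (m == 0).nonzero(as_tuple=True)[0]
        revived = dead[g[dead].abs().argsort(descending=True)[:n]]
        m[revived] = 1
    return m.view_as(mask)  # new mask; regrown weights typically start at zero
```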

Local AdaGrad-Type Algorithm for Stochastic Convex-Concave Optimization

no code implementations18 Jun 2021 Luofeng Liao, Li Shen, Jia Duan, Mladen Kolar, DaCheng Tao

Large scale convex-concave minimax problems arise in numerous applications, including game theory, robust training, and training of generative adversarial networks.

Generative Adversarial Network Vocal Bursts Type Prediction

Structure-Regularized Attention for Deformable Object Representation

1 code implementation12 Jun 2021 Shenao Zhang, Li Shen, Zhifeng Li, Wei Liu

Capturing contextual dependencies has proven useful to improve the representational power of deep neural networks.

Object

Attacking Adversarial Attacks as A Defense

no code implementations9 Jun 2021 Boxi Wu, Heng Pan, Li Shen, Jindong Gu, Shuai Zhao, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu

In this work, we find that the adversarial attacks can also be vulnerable to small perturbations.

Differentiable Neural Architecture Search for Extremely Lightweight Image Super-Resolution

1 code implementation9 May 2021 Han Huang, Li Shen, Chaoyang He, Weisheng Dong, Wei Liu

Specifically, the cell-level search space is designed based on an information distillation mechanism, focusing on the combinations of lightweight operations and aiming to build a more lightweight and accurate SR structure.

Image Super-Resolution Neural Architecture Search +2

Robust Registration of Multimodal Remote Sensing Images Based on Structural Similarity

no code implementations31 Mar 2021 Yuanxin Ye, Jie Shan, Lorenzo Bruzzone, Li Shen

Moreover, a robust registration method based on HOPCncc is proposed in this paper and evaluated using six pairs of multimodal remote sensing images.

Image Registration Template Matching

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration

no code implementations14 Jan 2021 Congliang Chen, Li Shen, Fangyu Zou, Wei Liu

Adam is one of the most influential adaptive stochastic algorithms for training deep neural networks, which has been pointed out to be divergent even in the simple convex setting via a few simple counterexamples.

Stochastic Optimization

Adaptive Compact Attention For Few-shot Video-to-video Translation

no code implementations30 Nov 2020 Risheng Huang, Li Shen, Xuan Wang, Cheng Lin, Hao-Zhi Huang

This paper proposes an adaptive compact attention model for few-shot video-to-video translation.

Translation

Stochastic Client Selection for Federated Learning with Volatile Clients

no code implementations17 Nov 2020 Tiansheng Huang, Weiwei Lin, Li Shen, Keqin Li, Albert Y. Zomaya

Federated Learning (FL), arising as a privacy-preserving machine learning paradigm, has received notable attention from the public.

Fairness Federated Learning +1

Functional Connectome Fingerprint Gradients in Young Adults

no code implementations10 Nov 2020 Uttara Tipnis, Kausar Abbas, Elizabeth Tran, Enrico Amico, Li Shen, Alan D. Kaplan, Joaquín Goñi

Our differential identifiability results show that the fingerprint gradients based on genetic and environmental similarities are indeed present when comparing FCs for all parcellations and fMRI conditions.

A Distributed Training Algorithm of Generative Adversarial Networks with Quantized Gradients

no code implementations26 Oct 2020 Xiaojun Chen, Shu Yang, Li Shen, Xuanrong Pang

In this paper, we propose a distributed GANs training algorithm with quantized gradients, dubbed DQGAN, which is the first distributed training method with quantized gradients for GANs.
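
For intuition, a QSGD-style unbiased stochastic quantizer is one standard way to compress gradients for communication; the sketch below only illustrates the idea of quantized-gradient training and is not DQGAN's actual scheme.

```python
import torch

def stochastic_quantize(grad, levels=16):
    """QSGD-style unbiased stochastic uniform quantization (illustrative).
    Maps |g|/||g|| onto `levels` buckets, rounding up or down at random
    with probability equal to the fractional part, so E[output] = grad."""
    norm = grad.norm()
    if norm == 0:
        return grad.clone()
    scaled = grad.abs() / norm * levels
    lower = scaled.floor()
    prob = scaled - lower                       # fractional part
    quantized = lower + (torch.rand_like(prob) < prob).float()
    return grad.sign() * quantized * norm / levels
```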

Improving the spatial resolution of a BOTDA sensor using deconvolution algorithm

no code implementations15 Sep 2020 Li Shen, Zhiyong Zhao, Can Zhao, Hao Wu, Chao Lu, Ming Tang

The frequency dependency of the Brillouin gain temporal envelope is investigated by simulation, and its impact on the results recovered by the deconvolution algorithm is thoroughly analyzed.

Denoising
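
As a small illustration of the general approach, frequency-domain Wiener deconvolution is one classical way to sharpen a long-pulse response; the function below is an assumption-level sketch, not the algorithm analyzed in the paper.

```python
import numpy as np

def wiener_deconvolve(measured, kernel, snr=100.0):
    """Frequency-domain Wiener deconvolution of a 1-D trace (sketch).
    `kernel` models the long-pulse response; `snr` is an assumed
    signal-to-noise ratio that regularizes noise-dominated bins."""
    n = len(measured)
    H = np.fft.rfft(kernel, n)
    Y = np.fft.rfft(measured, n)
    # Wiener filter: H* / (|H|^2 + 1/SNR) attenuates noise-dominated bins.
    G = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.fft.irfft(Y * G, n)
```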

Network reinforcement driven drug repurposing for COVID-19 by exploiting disease-gene-drug associations

no code implementations12 Aug 2020 Yonghyun Nam, Jae-Seung Yun, Seung Mi Lee, Ji Won Park, Ziqi Chen, Brian Lee, Anurag Verma, Xia Ning, Li Shen, Dokyoon Kim

To reduce trial and error in finding treatments for COVID-19, we propose building a network-based drug repurposing framework to prioritize repurposable drugs.

Grouping effects of sparse CCA models in variable selection

no code implementations7 Aug 2020 Kefei Liu, Qi Long, Li Shen

The sparse canonical correlation analysis (SCCA) is a bi-multivariate association model that finds sparse linear combinations of two sets of variables that are maximally correlated with each other.

Variable Selection
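
For reference, a standard sparse CCA formulation from the penalized-CCA literature (the exact variant studied in this paper may differ) seeks sparse weight vectors $u$ and $v$:

```latex
\max_{u,\,v}\; u^{\top} X^{\top} Y v
\quad \text{s.t.} \quad
\|u\|_2^{2} \le 1,\;\; \|v\|_2^{2} \le 1,\;\;
\|u\|_1 \le c_1,\;\; \|v\|_1 \le c_2 .
```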

Task-agnostic Temporally Consistent Facial Video Editing

no code implementations3 Jul 2020 Meng Cao, Hao-Zhi Huang, Hao Wang, Xuan Wang, Li Shen, Sheng Wang, Linchao Bao, Zhifeng Li, Jiebo Luo

Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.

3D Reconstruction Video Editing

Real-time Universal Style Transfer on High-resolution Images via Zero-channel Pruning

no code implementations16 Jun 2020 Jie An, Tao Li, Hao-Zhi Huang, Li Shen, Xuan Wang, Yongyi Tang, Jinwen Ma, Wei Liu, Jiebo Luo

Extracting effective deep features to represent content and style information is the key to universal style transfer.

Style Transfer

AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks

1 code implementation16 Jun 2020 Yuesong Tian, Li Shen, Guinan Su, Zhifeng Li, Wei Liu

To this end, we propose a fully differentiable search framework for generative adversarial networks, dubbed alphaGAN.

CPOT: Channel Pruning via Optimal Transport

no code implementations21 May 2020 Yucong Shen, Li Shen, Hao-Zhi Huang, Xuan Wang, Wei Liu

Recent advances in deep neural networks (DNNs) lead to tremendously growing network parameters, making the deployments of DNNs on platforms with limited resources extremely difficult.

Image-to-Image Translation Translation

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

1 code implementation ICML 2020 Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

In this paper, we study distributed algorithms for large-scale AUC maximization with a deep neural network as a predictive model.

Distributed Optimization

Quantized Adam with Error Feedback

no code implementations29 Apr 2020 Congliang Chen, Li Shen, Hao-Zhi Huang, Wei Liu

In this paper, we present a distributed variant of adaptive stochastic gradient method for training deep neural networks in the parameter-server model.

Quantization
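
The error-feedback idea can be sketched in a few lines: the compression error is kept locally and added back to the next gradient before quantizing, so no information is permanently lost. The class below is a hedged illustration using a 1-bit sign quantizer, not the paper's exact method.

```python
import torch

class ErrorFeedbackQuantizer:
    """Illustrative 1-bit (sign) quantizer with error feedback."""
    def __init__(self):
        self.residual = None

    def compress(self, grad):
        if self.residual is None:
            self.residual = torch.zeros_like(grad)
        corrected = grad + self.residual        # add back past compression error
        scale = corrected.abs().mean()          # per-tensor scale factor
        quantized = scale * corrected.sign()    # payload: 1 bit/entry + scale
        self.residual = corrected - quantized   # error stays local, fed back next step
        return quantized
```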

MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation

1 code implementation CVPR 2020 Chaoyang He, Haishan Ye, Li Shen, Tong Zhang

To remedy this, this paper proposes MiLeNAS, a mixed-level reformulation for NAS that can be optimized efficiently and reliably.

Bilevel Optimization Neural Architecture Search +1
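
One way to read a mixed-level reformulation (stated here as an assumption about the general idea, not quoted from the paper) is to sidestep the expensive bilevel problem by updating the architecture parameters $\alpha$ on a weighted combination of training and validation losses:

```latex
w \leftarrow w - \eta_{w}\, \nabla_{w} \mathcal{L}_{\mathrm{tr}}(w, \alpha),
\qquad
\alpha \leftarrow \alpha - \eta_{\alpha}
\left( \nabla_{\alpha} \mathcal{L}_{\mathrm{tr}}(w, \alpha)
 + \lambda\, \nabla_{\alpha} \mathcal{L}_{\mathrm{val}}(w, \alpha) \right).
```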

Cognitive Biomarker Prioritization in Alzheimer's Disease using Brain Morphometric Data

no code implementations18 Feb 2020 Bo Peng, Xiaohui Yao, Shannon L. Risacher, Andrew J. Saykin, Li Shen, Xia Ning

This method learns the latent scoring function that pushes the most effective cognitive assessments onto the top of the prioritization list.

Learning-To-Rank

Generalized Embedding Machines for Recommender Systems

no code implementations16 Feb 2020 Enneng Yang, Xin Xin, Li Shen, Guibing Guo

In this work, we propose an alternative approach to model high-order interaction signals in the embedding level, namely Generalized Embedding Machine (GEM).

Recommendation Systems

Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

no code implementations19 Nov 2019 Yingru Liu, Xuewen Yang, Dongliang Xie, Xin Wang, Li Shen, Hao-Zhi Huang, Niranjan Balasubramanian

In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL.

Multi-Task Learning

MAP Inference via L2-Sphere Linear Program Reformulation

1 code implementation9 May 2019 Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem

Thus, LS-LP is equivalent to the original MAP inference problem.

Drug-drug interaction prediction based on co-medication patterns and graph matching

no code implementations22 Feb 2019 Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning

Background: This manuscript considers the problem of predicting whether a drug combination of arbitrary order is likely to induce adverse drug reactions.

Graph Matching

A Sufficient Condition for Convergences of Adam and RMSProp

no code implementations CVPR 2019 Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu

Adam and RMSProp are two of the most influential adaptive stochastic algorithms for training deep neural networks, which have been pointed out to be divergent even in the convex setting via a few simple counterexamples.

Stochastic Optimization
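
For context, the Adam-style update that such convergence analyses study is (bias correction omitted for brevity; $g_t$ is the stochastic gradient):

```latex
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2}, \qquad
\theta_{t+1} = \theta_t - \frac{\alpha_t\, m_t}{\sqrt{v_t} + \epsilon},
```

with RMSProp recovered as the special case $\beta_1 = 0$.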

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

9 code implementations NeurIPS 2018 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi

We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.
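
In its simplest parameter-free form, a gather-excite operator can be sketched as follows: "gather" aggregates features over a spatial extent, and "excite" broadcasts the result back as a multiplicative gate. The module below assumes a global extent and is an illustration, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GEBlock(nn.Module):
    """Parameter-free gather-excite sketch with a global extent."""
    def forward(self, x):                        # x: (N, C, H, W)
        gathered = F.adaptive_avg_pool2d(x, 1)   # gather: global spatial context
        gate = torch.sigmoid(gathered)           # excite: per-channel gate
        return x * gate                          # broadcast multiply over H, W
```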

A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration

no code implementations10 Aug 2018 Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun, Wei Liu

Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as AdaGrad, RMSProp, Adam, AccAdaGrad, etc.

Stochastic Optimization

Image Registration and Predictive Modeling: Learning the Metric on the Space of Diffeomorphisms

no code implementations10 Aug 2018 Ayagoz Mussabayeva, Alexey Kroshnin, Anvar Kurmukov, Yulia Dodonova, Li Shen, Shan Cong, Lei Wang, Boris A. Gutman

We present a method for metric optimization in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, by treating the induced Riemannian metric on the space of diffeomorphisms as a kernel in a machine learning context.

General Classification Image Registration

Comparator Networks

no code implementations ECCV 2018 Weidi Xie, Li Shen, Andrew Zisserman

Our contributions are: (i) We propose a Deep Comparator Network (DCN) that can ingest a pair of sets (each may contain a variable number of images) as inputs and compute a similarity between the pair; this involves attending to multiple discriminative local regions (landmarks) and comparing local descriptors between pairs of faces; (ii) To encourage high-quality representations for each set, internal competition is introduced for recalibration based on the landmark score; (iii) Inspired by image retrieval, a novel hard sample mining regime is proposed to control the sampling process, such that the DCN is complementary to the standard image classification models.

Face Recognition Image Classification +2

An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method

no code implementations ICML 2018 Li Shen, Peng Sun, Yitong Wang, Wei Liu, Tong Zhang

Specifically, we find that a large class of primal and primal-dual operator splitting algorithms are all special cases of VMOR-HPE.

Drug Recommendation toward Safe Polypharmacy

no code implementations8 Mar 2018 Wen-Hao Chiang, Li Shen, Lang Li, Xia Ning

Adverse drug reactions (ADRs) induced from high-order drug-drug interactions (DDIs) due to polypharmacy represent a significant public health problem.

A Decomposition Algorithm for the Sparse Generalized Eigenvalue Problem

no code implementations CVPR 2019 Ganzhao Yuan, Li Shen, Wei-Shi Zheng

The sparse generalized eigenvalue problem arises in a number of standard and modern statistical learning models, including sparse principal component analysis, sparse Fisher discriminant analysis, and sparse canonical correlation analysis.

Numerical Analysis

End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design

4 code implementations15 Nov 2017 Li Shen

We also demonstrate that a whole image model trained on DDSM can be easily transferred to INbreast without using its lesion annotations and using only a small amount of training data.

VGGFace2: A dataset for recognising faces across pose and age

24 code implementations23 Oct 2017 Qiong Cao, Li Shen, Weidi Xie, Omkar M. Parkhi, Andrew Zisserman

The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise.

Ranked #1 on Face Verification on IJB-C (training dataset metric)

Face Recognition Face Verification +1

Squeeze-and-Excitation Networks

82 code implementations CVPR 2018 Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission, which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%.

Image Classification
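
A minimal PyTorch rendering of the squeeze-and-excitation block described above (reduction ratio 16 is the commonly used default; treat this as a sketch rather than the reference implementation):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-average-pool ("squeeze"), a
    two-layer bottleneck MLP, and a sigmoid gate that rescales
    each channel ("excitation")."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                  # x: (N, C, H, W)
        n, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))    # squeeze -> (N, C) channel descriptor
        return x * w.view(n, c, 1, 1)      # excite: channel-wise rescaling
```

The gate is cheap: for a 256-channel feature map it adds only two small linear layers, yet lets the network reweight channels using global context.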

Deep Learning to Improve Breast Cancer Early Detection on Screening Mammography

5 code implementations30 Aug 2017 Li Shen, Laurie R. Margolies, Joseph H. Rothstein, Eugene Fluder, Russell B. McBride, Weiva Sieh

We also demonstrate that a whole image classifier trained using our end-to-end approach on the DDSM digitized film mammograms can be transferred to INbreast FFDM images using only a subset of the INbreast data for fine-tuning and without further reliance on the availability of lesion annotations.

Breast Cancer Detection Cancer-no cancer per image classification +1
