Search Results for author: Lin Xiao

Found 42 papers, 10 papers with code

Pairwise Instance Relation Augmentation for Long-tailed Multi-label Text Classification

no code implementations 19 Nov 2022 Lin Xiao, Pengyu Xu, Liping Jing, Xiangliang Zhang

In response, we propose a Pairwise Instance Relation Augmentation Network (PIRAN) to augment tailed-label documents for balancing tail labels and head labels.

Multi Label Text Classification Multi-Label Text Classification +2

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies

no code implementations 4 Oct 2022 Rui Yuan, Simon S. Du, Robert M. Gower, Alessandro Lazaric, Lin Xiao

We consider infinite-horizon discounted Markov decision processes and study the convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log-linear policy class.

Policy Gradient Methods

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games

no code implementations 3 Oct 2022 Shicong Cen, Yuejie Chi, Simon S. Du, Lin Xiao

Multi-Agent Reinforcement Learning (MARL) -- where multiple agents learn to interact in a shared dynamic environment -- permeates across a wide range of critical applications.

Multi-agent Reinforcement Learning

Grad-GradaGrad? A Non-Monotone Adaptive Stochastic Gradient Method

no code implementations 14 Jun 2022 Aaron Defazio, Baoyu Zhou, Lin Xiao

The classical AdaGrad method adapts the learning rate by dividing by the square root of a sum of squared gradients.
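The AdaGrad rule described in this abstract can be sketched as follows. This is a minimal illustration of classical AdaGrad only, not the non-monotone method the paper proposes; the function name `adagrad_step`, the `eps` safeguard, and the toy objective are assumptions made for the example.

```python
import math

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One classical AdaGrad update on plain lists of floats.

    accum holds the running sum of squared gradients per coordinate.
    eps is a small constant (an assumption here) that avoids division by zero.
    """
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_params = [
        p - lr * g / (math.sqrt(a) + eps)
        for p, g, a in zip(params, grads, new_accum)
    ]
    return new_params, new_accum

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x, acc = [1.0], [0.0]
for _ in range(100):
    g = [2.0 * x[0]]
    x, acc = adagrad_step(x, g, acc, lr=0.5)
```

Because the accumulator only grows, the effective step size in this classical form can only shrink over time.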

BiT: Robustly Binarized Multi-distilled Transformer

2 code implementations 25 May 2022 Zechun Liu, Barlas Oguz, Aasish Pappu, Lin Xiao, Scott Yih, Meng Li, Raghuraman Krishnamoorthi, Yashar Mehdad

Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine learning, but have also grown in parameters and computational complexity, making them increasingly difficult to deploy in resource-constrained environments.

Binarization

On Continual Model Refinement in Out-of-Distribution Data Streams

no code implementations ACL 2022 Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Wen-tau Yih

Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams while overcoming catastrophic forgetting.

Benchmarking Continual Learning

FedShuffle: Recipes for Better Use of Local Work in Federated Learning

no code implementations 27 Apr 2022 Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik, Michael Rabbat

The practice of applying several local updates before aggregation across clients has been empirically shown to be a successful approach to overcoming the communication bottleneck in Federated Learning (FL).

Federated Learning

Federated Learning with Partial Model Personalization

2 code implementations 8 Apr 2022 Krishna Pillutla, Kshitiz Malik, Abdelrahman Mohamed, Michael Rabbat, Maziar Sanjabi, Lin Xiao

We consider two federated learning algorithms for training partially personalized models, where the shared and personal parameters are updated either simultaneously or alternately on the devices.

Federated Learning

On the Convergence Rates of Policy Gradient Methods

no code implementations 19 Jan 2022 Lin Xiao

We consider infinite-horizon discounted Markov decision problems with finite state and action spaces and study the convergence rates of the projected policy gradient method and a general class of policy mirror descent methods, all with direct parametrization in the policy space.

Policy Gradient Methods

Importance Estimation from Multiple Perspectives for Keyphrase Extraction

no code implementations EMNLP 2021 Mingyang Song, Liping Jing, Lin Xiao

Keyphrase extraction is a fundamental task in Natural Language Processing, which usually contains two main parts: candidate keyphrase extraction and keyphrase importance estimation.

Chunking Keyphrase Extraction +1

Cross-Platform Simulation Architecture with application to truck platooning impact assessment

no code implementations 19 May 2021 Andres Ladino, Lin Xiao, Kingsley Adjenugwhure, Nicolás Deschle, Gerdien Klunder

Simulation-based traffic impact assessment studies of advanced technologies such as truck platooning need to be carried out to ascertain their benefits for traffic efficiency, safety and environment.

Does Head Label Help for Long-Tailed Multi-Label Text Classification

1 code implementation 24 Jan 2021 Lin Xiao, Xiangliang Zhang, Liping Jing, Chi Huang, Mingyang Song

To address the challenge of insufficient training data on tail label classification, we propose a Head-to-Tail Network (HTTN) to transfer the meta-knowledge from the data-rich head labels to data-poor tail labels.

General Classification Multi Label Text Classification +2

Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model

no code implementations 12 Oct 2020 Mingzhi Zheng, Dinghan Shen, Yelong Shen, Weizhu Chen, Lin Xiao

We prove, from a theoretical perspective, that the gradients derived from this new masking schema have a smaller variance and can lead to more efficient self-supervised training.

Language Modelling Sentence Classification

Hyperbolic Capsule Networks for Multi-Label Classification

no code implementations ACL 2020 Boli Chen, Xin Huang, Lin Xiao, Liping Jing

Second, Hyperbolic Dynamic Routing (HDR) is introduced to aggregate hyperbolic capsules in a label-aware manner, so that the label-level discriminative information can be preserved along the depth of neural networks.

Classification General Classification +1

Jointly Modeling Intra- and Inter-transaction Dependencies with Hierarchical Attentive Transaction Embeddings for Next-item Recommendation

no code implementations 30 May 2020 Shoujin Wang, Longbing Cao, Liang Hu, Shlomo Berkovsky, Xiaoshui Huang, Lin Xiao, Wenpeng Lu

Most existing TBRSs recommend next item by only modeling the intra-transaction dependency within the current transaction while ignoring inter-transaction dependency with recent transactions that may also affect the next item.

Recommendation Systems

Statistical Adaptive Stochastic Gradient Methods

1 code implementation 25 Feb 2020 Pengchuan Zhang, Hunter Lang, Qiang Liu, Lin Xiao

We propose a statistical adaptive procedure called SALSA for automatically scheduling the learning rate (step size) in stochastic gradient methods.

Scheduling

Statistical Adaptive Stochastic Optimization

no code implementations 25 Sep 2019 Pengchuan Zhang, Hunter Lang, Qiang Liu, Lin Xiao

We investigate statistical methods for automatically scheduling the learning rate (step size) in stochastic optimization.

Scheduling Stochastic Optimization

Using Statistics to Automate Stochastic Optimization

no code implementations NeurIPS 2019 Hunter Lang, Pengchuan Zhang, Lin Xiao

Despite the development of numerous adaptive optimizers, tuning the learning rate of stochastic gradient methods remains a major roadblock to obtaining good practical performance in machine learning.

Stochastic Optimization

Multi-Level Composite Stochastic Optimization via Nested Variance Reduction

no code implementations 29 Aug 2019 Junyu Zhang, Lin Xiao

We consider multi-level composite optimization problems where each mapping in the composition is the expectation over a family of random smooth mappings or the sum of some finite number of smooth mappings.

Stochastic Optimization

From low probability to high confidence in stochastic convex optimization

no code implementations 31 Jul 2019 Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang

Standard results in stochastic convex optimization bound the number of samples that an algorithm needs to generate a point with small function value in expectation.

Stochastic Optimization Vocal Bursts Intensity Prediction

A Stochastic Composite Gradient Method with Incremental Variance Reduction

no code implementations NeurIPS 2019 Junyu Zhang, Lin Xiao

We show that this method achieves the same orders of complexity as the best known first-order methods for minimizing expected-value and finite-sum nonconvex functions, despite the additional outer composition which renders the composite gradient estimator biased.

Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

1 code implementation 26 May 2019 Boli Chen, Xin Huang, Lin Xiao, Zixin Cai, Liping Jing

The main reason is that the tree-likeness of the hyperbolic space matches the complexity of symbolic data with hierarchical structures.

Classification General Classification +1

Coupled Variational Bayes via Optimization Embedding

1 code implementation NeurIPS 2018 Bo Dai, Hanjun Dai, Niao He, Weiyang Liu, Zhen Liu, Jianshu Chen, Lin Xiao, Le Song

This flexible function class couples the variational distribution with the original parameters in the graphical models, allowing end-to-end learning of the graphical models by back-propagation through the variational distribution.

Variational Inference

Learning SMaLL Predictors

no code implementations NeurIPS 2018 Vikas K. Garg, Ofer Dekel, Lin Xiao

We present a new machine learning technique for training small resource-constrained predictors.

BIG-bench Machine Learning

SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation

no code implementations ICML 2018 Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song

When function approximation is used, solving the Bellman optimality equation with stability guarantees has remained a major open problem in reinforcement learning for decades.

Q-Learning reinforcement-learning +1

Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes

no code implementations NeurIPS 2017 Jianshu Chen, Chong Wang, Lin Xiao, Ji He, Lihong Li, Li Deng

In sequential decision making, it is often important and useful for end users to understand the underlying patterns or causes that lead to the corresponding decisions.

Decision Making Q-Learning +2

Stochastic Variance Reduction Methods for Policy Evaluation

no code implementations ICML 2017 Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states' long-term value under a given policy.
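As a rough illustration of what policy evaluation computes, the sketch below iterates the Bellman equation V = r + γPV for a fixed policy on a tiny, fully known Markov chain. This is plain iterative evaluation, not the stochastic variance reduction methods the paper studies; the function name `evaluate_policy` and the two-state chain are assumptions made for the example.

```python
def evaluate_policy(P, r, gamma, iters=200):
    """Iteratively solve V = r + gamma * P V for a fixed policy.

    P[s][t] is the probability of moving from state s to state t under
    the policy, r[s] is the expected reward in state s, and gamma is the
    discount factor. All inputs here are hypothetical toy values.
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            r[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
            for s in range(n)
        ]
    return v

# Toy chain: state 0 pays reward 1 and moves to the absorbing,
# zero-reward state 1 with probability 0.5.
P = [[0.5, 0.5], [0.0, 1.0]]
r = [1.0, 0.0]
V = evaluate_policy(P, r, gamma=0.9)
```

In this toy chain the value of state 0 satisfies V0 = 1 + 0.45 V0, so the iteration approaches 1 / 0.55.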

Reinforcement Learning (RL)

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

1 code implementation NeurIPS 2015 Jianshu Chen, Ji He, Yelong Shen, Lin Xiao, Xiaodong He, Jianfeng Gao, Xinying Song, Li Deng

We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document.

General Classification Topic Models

Variational Gram Functions: Convex Analysis and Optimization

no code implementations 16 Jul 2015 Amin Jalali, Maryam Fazel, Lin Xiao

We propose a new class of convex penalty functions, called variational Gram functions (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space.

General Classification

Communication-Efficient Distributed Optimization of Self-Concordant Empirical Loss

no code implementations 1 Jan 2015 Yuchen Zhang, Lin Xiao

We consider distributed convex optimization problems originated from sample average approximation of stochastic optimization, or empirical risk minimization in machine learning.

Binary Classification Distributed Computing +2

An Accelerated Proximal Coordinate Gradient Method

no code implementations NeurIPS 2014 Qihang Lin, Zhaosong Lu, Lin Xiao

We develop an accelerated randomized proximal coordinate gradient (APCG) method, for solving a broad class of composite convex optimization problems.

Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization

no code implementations 10 Sep 2014 Yuchen Zhang, Lin Xiao

We consider a generic convex optimization problem associated with regularized empirical risk minimization of linear predictors.

A Proximal Stochastic Gradient Method with Progressive Variance Reduction

no code implementations 19 Mar 2014 Lin Xiao, Tong Zhang

We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping.
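The problem structure in this abstract (a smooth average plus a convex term with a simple proximal mapping) can be illustrated with a basic proximal gradient step. The sketch below uses the full gradient and an L1 regularizer, whose proximal mapping is soft-thresholding; it does not implement the progressive variance reduction of the paper. The names `soft_threshold` and `prox_grad_step`, the L1 choice, and the toy data are assumptions made for the example.

```python
def soft_threshold(v, tau):
    """Proximal mapping of tau * |v|: shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def prox_grad_step(x, grad, step, lam):
    """Gradient step on the smooth part, then the prox of step * lam * ||.||_1."""
    return [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, grad)]

# Toy problem: minimize the average of 0.5 * (x - a_i)^2 plus lam * |x|.
a = [1.0, 3.0]
lam = 0.5
x = [0.0]
for _ in range(60):
    # Full gradient of the smooth part: the average of (x - a_i).
    grad = [sum(x[0] - ai for ai in a) / len(a)]
    x = prox_grad_step(x, grad, step=0.5, lam=lam)
```

For this toy problem the minimizer is the soft-thresholded mean of the data, `soft_threshold(2.0, 0.5)`, which the iteration approaches.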

Online Classification Using a Voted RDA Method

no code implementations 17 Oct 2013 Tianbing Xu, Jianfeng Gao, Lin Xiao, Amelia Regan

We propose a voted dual averaging method for online classification problems with explicit regularization.

Classification General Classification

A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming

no code implementations 25 Jun 2013 Zhaosong Lu, Lin Xiao

When the problem under consideration is convex, we show that the expected objective values generated by RNBPG converge to the optimal value of the problem.

On the Complexity Analysis of Randomized Block-Coordinate Descent Methods

no code implementations 21 May 2013 Zhaosong Lu, Lin Xiao

In this paper we analyze the randomized block-coordinate descent (RBCD) methods proposed in [8, 11] for minimizing the sum of a smooth convex function and a block-separable convex function.

Dual Averaging Method for Regularized Stochastic Learning and Online Optimization

no code implementations NeurIPS 2009 Lin Xiao

We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as L1-norm for sparsity.
