Search Results for author: Alexander J. Smola

Found 37 papers, 18 papers with code

Time-Varying Propensity Score to Bridge the Gap between the Past and Present

no code implementations • 4 Oct 2022 • Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola

Real-world deployment of machine learning models is challenging because data evolves over time.

Continuous Control Image Classification

Faster Deep Reinforcement Learning with Slower Online Network

1 code implementation • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.

reinforcement-learning Reinforcement Learning (RL)

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

2 code implementations • 4 Nov 2021 • Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola

We consider the use of automated supervised learning systems for data tables that not only contain numeric/categorical columns, but one or more text fields as well.

Ranked #2 on Binary Classification on kickstarter

AutoML Benchmarking +1

7,084

Deep Explicit Duration Switching Models for Time Series

1 code implementation • NeurIPS 2021 • Abdul Fatir Ansari, Konstantinos Benidis, Richard Kurle, Ali Caner Turkmen, Harold Soh, Alexander J. Smola, Yuyang Wang, Tim Januschowski

We propose the Recurrent Explicit Duration Switching Dynamical System (RED-SDS), a flexible model that is capable of identifying both state- and time-dependent switching dynamics.

Time Series Time Series Analysis

Dive into Deep Learning

1 code implementation • 21 Jun 2021 • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code.

Math Multi-Domain Recommender Systems

21,613

Flexible Model Aggregation for Quantile Regression

1 code implementation • 26 Feb 2021 • Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani

Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive.

Econometrics Prediction Intervals +1

Continuous Doubly Constrained Batch Reinforcement Learning

1 code implementation • NeurIPS 2021 • Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J. Smola

Reliant on too many experiments to learn good actions, current Reinforcement Learning (RL) algorithms have limited applicability in real-world settings, which can be too expensive to allow exploration.

reinforcement-learning Reinforcement Learning (RL)

GraphHINGE: Learning Interaction Models of Structured Neighborhood on Heterogeneous Information Network

1 code implementation • 25 Nov 2020 • Jiarui Jin, Kounianhua Du, Weinan Zhang, Jiarui Qin, Yuchen Fang, Yong Yu, Zheng Zhang, Alexander J. Smola

Heterogeneous information network (HIN) has been widely used to characterize entities of various types and their complex relations.

Click-Through Rate Prediction

An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph

1 code implementation • 1 Jul 2020 • Jiarui Jin, Jiarui Qin, Yuchen Fang, Kounianhua Du, Wei-Nan Zhang, Yong Yu, Zheng Zhang, Alexander J. Smola

To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.

Recommendation Systems

DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning

no code implementations • 26 Jun 2020 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

This paper prescribes a suite of techniques for off-policy Reinforcement Learning (RL) that simplify the training process and reduce the sample complexity.

Continuous Control reinforcement-learning +1

Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation

1 code implementation • NeurIPS 2020 • Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari, Alexander J. Smola

Automated machine learning (AutoML) can produce complex model ensembles by stacking, bagging, and boosting many individual models like trees, deep networks, and nearest neighbor estimators.

AutoML Data Augmentation

7,084

TraDE: Transformers for Density Estimation

no code implementations • 6 Apr 2020 • Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data.

Density Estimation Out-of-Distribution Detection

Transformer on a Diet

1 code implementation • 14 Feb 2020 • Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

Transformer has been widely used thanks to its ability to capture sequence information in an efficient way.

Language Modelling

Meta-Q-Learning

2 code implementations • ICLR 2020 • Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).

Continuous Control Meta Reinforcement Learning +1

103

P3O: Policy-on Policy-off Policy Optimization

1 code implementation • 5 May 2019 • Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola

Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms.

Reinforcement Learning (RL)

Language Models with Transformers

1 code implementation • arXiv 2019 • Chenguang Wang, Mu Li, Alexander J. Smola

In this paper, we explore effective Transformer architectures for language modeling, including adding LSTM layers to better capture the sequential context while still keeping the computation efficient.

Ranked #2 on Language Modelling on Penn Treebank (Word Level) (using extra training data)

Computational Efficiency Language Modelling +1

Compressed Video Action Recognition

1 code implementation • CVPR 2018 • Chao-yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl

We propose to train a deep network directly on the compressed video.

Ranked #46 on Action Classification on Charades (using extra training data)

Action Classification Action Recognition +2

495

State Space LSTM Models with Particle MCMC Inference

no code implementations • ICLR 2018 • Xun Zheng, Manzil Zaheer, Amr Ahmed, Yu-An Wang, Eric P. Xing, Alexander J. Smola

Long Short-Term Memory (LSTM) is one of the most powerful sequence models.

Topic Models

Variational Reasoning for Question Answering with Knowledge Graph

1 code implementation • 12 Sep 2017 • Yuyu Zhang, Hanjun Dai, Zornitsa Kozareva, Alexander J. Smola, Le Song

Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts.

Knowledge Graphs Question Answering +1

A Generic Approach for Escaping Saddle points

no code implementations • 5 Sep 2017 • Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J. Smola

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points.

Second-order methods

Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data

no code implementations • ICML 2017 • Manzil Zaheer, Amr Ahmed, Alexander J. Smola

Recurrent neural networks, such as long short-term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010).

Clustering

Sampling Matters in Deep Embedding Learning

6 code implementations • ICCV 2017 • Chao-yuan Wu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl

In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions.

Ranked #5 on Image Retrieval on CARS196

Clustering Face Verification +4

262

Spectral Methods for Nonparametric Models

no code implementations • 31 Mar 2017 • Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola

Nonparametric models are versatile, albeit computationally expensive, tools for modeling mixture models.

McKernel: A Library for Approximate Kernel Expansions in Log-linear Time

1 code implementation • 27 Feb 2017 • Joachim D. Curtó, Irene C. Zarza, Feng Yang, Alexander J. Smola, Fernando de la Torre, Chong-Wah Ngo, Luc van Gool

The algorithm requires computing the product of Walsh Hadamard Transform (WHT) matrices.

General Classification

Recurrent Recommender Networks

no code implementations • WSDM 2017 • Chao-yuan Wu, Amr Ahmed, Alex Beutel, Alexander J. Smola, How Jing

Recommender systems traditionally assume that user profiles and movie attributes are static.

Recommendation Systems

Variance Reduction in Stochastic Gradient Langevin Dynamics

no code implementations • NeurIPS 2016 • Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing

In this paper, we present techniques for reducing variance in stochastic gradient Langevin dynamics, yielding novel stochastic Monte Carlo methods that improve performance by reducing the variance in the stochastic gradient.

BIG-bench Machine Learning

Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization

no code implementations • NeurIPS 2016 • Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alexander J. Smola

We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonsmooth part is convex.

Attributing Hacks

1 code implementation • 7 Nov 2016 • Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng, Jun Zhou

That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as estimating the evolution of these vulnerabilities over time.

Attribute

Explaining reviews and ratings with PACO: Poisson Additive Co-Clustering

no code implementations • 6 Dec 2015 • Chao-yuan Wu, Alex Beutel, Amr Ahmed, Alexander J. Smola

With this novel technique we propose a new Bayesian model for joint collaborative filtering of ratings and text reviews through a sum of simple co-clusterings.

Clustering Collaborative Filtering

AdaDelay: Delay Adaptive Distributed Stochastic Convex Optimization

no code implementations • 20 Aug 2015 • Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola

We study distributed stochastic convex optimization under the delayed gradient model where the server nodes perform parameter updates, while the worker nodes compute stochastic gradients.

Graph Partitioning via Parallel Submodular Approximation to Accelerate Distributed Machine Learning

no code implementations • 18 May 2015 • Mu Li, Dave G. Andersen, Alexander J. Smola

Distributed computing excels at processing large scale data, but the communication cost for synchronizing the shared parameters may slow down the overall performance.

BIG-bench Machine Learning Distributed Computing +1

Fast Differentially Private Matrix Factorization

no code implementations • 6 May 2015 • Ziqi Liu, Yu-Xiang Wang, Alexander J. Smola

Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed.

Collaborative Filtering

ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly

no code implementations • 31 Dec 2014 • Alex Beutel, Amr Ahmed, Alexander J. Smola

Matrix completion and approximation are popular tools to capture a user's preferences for recommendation and to approximate missing data.

Clustering Decision Making +1

A la Carte - Learning Fast Kernels

no code implementations • 19 Dec 2014 • Zichao Yang, Alexander J. Smola, Le Song, Andrew Gordon Wilson

Kernel methods have great promise for learning rich statistical representations of large modern datasets.

Communication Efficient Distributed Machine Learning with the Parameter Server

no code implementations • NeurIPS 2014 • Mu Li, David G. Andersen, Alexander J. Smola, Kai Yu

This paper describes a third-generation parameter server framework for distributed machine learning.

BIG-bench Machine Learning regression

Spectral Methods for Indian Buffet Process Inference

no code implementations • NeurIPS 2014 • Hsiao-Yu Tung, Alexander J. Smola

The Indian Buffet Process is a versatile statistical tool for modeling distributions over binary matrices.

Variance Reduction for Stochastic Gradient Optimization

no code implementations • NeurIPS 2013 • Chong Wang, Xi Chen, Alexander J. Smola, Eric P. Xing

We demonstrate how to construct the control variate for two practical problems using stochastic gradient optimization.

Variational Inference
