Search Results for author: Nikos Vlassis

Found 13 papers, 3 papers with code

Distributional Off-Policy Evaluation for Slate Recommendations

1 code implementation • 27 Aug 2023 • Shreyas Chaudhari, David Arbour, Georgios Theocharous, Nikos Vlassis

Prior work has developed estimators that leverage the structure in slates to estimate the expected off-policy performance, but the estimation of the entire performance distribution remains elusive.

Fairness • Off-policy evaluation

Local Policy Improvement for Recommender Systems

no code implementations • 22 Dec 2022 • Dawen Liang, Nikos Vlassis

The conventional way to address this problem is through importance sampling correction, but this comes with practical limitations.

Causal Inference • Self-Supervised Learning • +1
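
As a rough illustration of the importance sampling correction mentioned in the entry above, here is a minimal inverse propensity scoring (IPS) sketch for logged bandit feedback; the data and probabilities are hypothetical placeholders, not the paper's local policy improvement method.

```python
import numpy as np

def ips_estimate(rewards, logged_probs, target_probs):
    """Inverse propensity scoring: reweight logged rewards by the ratio of
    target-policy to logging-policy action probabilities."""
    weights = target_probs / logged_probs
    return np.mean(weights * rewards)

# Hypothetical logged data: rewards and action probabilities under each policy.
rewards      = np.array([1.0, 0.0, 1.0, 0.0])
logged_probs = np.array([0.5, 0.2, 0.4, 0.8])   # pi_logging(a | x)
target_probs = np.array([0.7, 0.1, 0.6, 0.3])   # pi_target(a | x)

print(ips_estimate(rewards, logged_probs, target_probs))
```

The variance of these weights is one of the practical limitations alluded to above: when the target and logging policies diverge, the weights become heavy-tailed and the estimate degrades.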

Control Variates for Slate Off-Policy Evaluation

1 code implementation • NeurIPS 2021 • Nikos Vlassis, Ashok Chandrashekar, Fernando Amat Gil, Nathan Kallus

We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates.

Off-policy evaluation • Recommendation Systems
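
Control variates are a generic variance reduction device: subtract a zero-mean quantity, scaled by a tuned coefficient, from an unbiased estimator. The sketch below applies this idea to a plain IPS estimate, using the fact that the importance weight has expectation 1 under the logging policy; the synthetic weights are an assumption for illustration, and this is not the slate estimator developed in the paper.

```python
import numpy as np

def cv_ips_estimate(rewards, weights):
    """IPS with a control variate: the importance weight w has known mean 1
    under the logging policy, so subtracting beta * (w - 1) reduces variance
    without changing the expectation."""
    y = weights * rewards
    c = weights - 1.0                               # control variate, E[c] = 0
    beta = np.cov(y, c)[0, 1] / np.var(c, ddof=1)   # variance-minimizing coefficient
    return np.mean(y - beta * c)

# Hypothetical importance weights and binary rewards.
rng = np.random.default_rng(0)
weights = rng.lognormal(mean=0.0, sigma=0.5, size=1000)
weights /= weights.mean()                           # normalize to mean ~1
rewards = rng.binomial(1, 0.3, size=1000).astype(float)
print(cv_ips_estimate(rewards, weights))
```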

Off-Policy Evaluation of Slate Policies under Bayes Risk

no code implementations • 5 Jan 2021 • Nikos Vlassis, Fernando Amat Gil, Ashok Chandrashekar

We study the problem of off-policy evaluation for slate bandits, for the typical case in which the logging policy factorizes over the slots of the slate.

Off-policy evaluation
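
When the logging policy factorizes over the slots of the slate, the propensity of an observed slate is simply the product of the per-slot action probabilities. A minimal sketch with a hypothetical three-slot slate:

```python
import numpy as np

def slate_propensity(slot_probs, slate):
    """Propensity of a slate under a logging policy that factorizes over slots:
    the product of per-slot action probabilities."""
    return np.prod([slot_probs[k][a] for k, a in enumerate(slate)])

# Hypothetical per-slot categorical distributions over items.
slot_probs = [
    {"a": 0.6, "b": 0.4},
    {"c": 0.5, "d": 0.5},
    {"e": 0.7, "f": 0.3},
]
print(slate_propensity(slot_probs, ("a", "d", "e")))  # 0.6 * 0.5 * 0.7
```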

More Efficient Off-Policy Evaluation through Regularized Targeted Learning

no code implementations • 13 Dec 2019 • Aurélien F. Bibaut, Ivana Malenica, Nikos Vlassis, Mark J. Van Der Laan

We study the problem of off-policy evaluation (OPE) in Reinforcement Learning (RL), where the aim is to estimate the performance of a new policy given historical data that may have been generated by a different policy, or policies.

Causal Inference • Off-policy evaluation
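
For context, the simplest OPE estimator in the RL setting reweights each logged trajectory's return by the product of per-step probability ratios between the new policy and the behavior policy. The sketch below only illustrates that baseline on hypothetical policies and trajectories; the paper's regularized targeted learning estimator is considerably more refined.

```python
import numpy as np

def trajectory_is_ope(trajectories, target_policy, behavior_policy, gamma=0.99):
    """Per-trajectory importance sampling: weight each trajectory's return by
    the product of per-step ratios pi_target / pi_behavior."""
    values = []
    for traj in trajectories:  # traj: list of (state, action, reward) tuples
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            weight *= target_policy(a, s) / behavior_policy(a, s)
            ret += (gamma ** t) * r
        values.append(weight * ret)
    return float(np.mean(values))

# Hypothetical two-action policies defined as probability functions.
behavior = lambda a, s: 0.5
target   = lambda a, s: 0.8 if a == 1 else 0.2
trajs = [[(0, 1, 1.0), (1, 0, 0.0)], [(0, 0, 0.0), (1, 1, 1.0)]]
print(trajectory_is_ope(trajs, target, behavior))
```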

Scalar Posterior Sampling with Applications

no code implementations • NeurIPS 2018 • Georgios Theocharous, Zheng Wen, Yasin Abbasi, Nikos Vlassis

Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity.
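
The DS-PSRL details are not reproduced in this listing; as a loose illustration of posterior sampling with a deterministic resampling schedule, the sketch below runs Thompson sampling on a Bernoulli bandit stand-in and refreshes the posterior sample only at exponentially spaced steps. The bandit, priors, and schedule are assumptions for illustration, not the paper's algorithm or guarantees.

```python
import numpy as np

def ds_posterior_sampling(arm_probs, horizon=1000, seed=0):
    """Posterior (Thompson) sampling on a Bernoulli bandit, with the posterior
    sample refreshed only on a deterministic, exponentially spaced schedule."""
    rng = np.random.default_rng(seed)
    n_arms = len(arm_probs)
    alpha, beta = np.ones(n_arms), np.ones(n_arms)   # Beta(1, 1) priors
    next_resample, sampled_means = 1, rng.beta(alpha, beta)
    total = 0.0
    for t in range(1, horizon + 1):
        if t >= next_resample:                        # deterministic schedule
            sampled_means = rng.beta(alpha, beta)
            next_resample *= 2
        arm = int(np.argmax(sampled_means))
        reward = rng.binomial(1, arm_probs[arm])
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total += reward
    return total / horizon

print(ds_posterior_sampling([0.3, 0.5, 0.7]))
```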

Optimizing over a Restricted Policy Class in Markov Decision Processes

no code implementations • 26 Feb 2018 • Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, Nikos Vlassis

However, under a condition that is akin to the occupancy measures of the base policies having large overlap, we show that there exists an efficient algorithm that finds a policy that is almost as good as the best convex combination of the base policies.

Policy Gradient Methods
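
One concrete way to realize a convex combination of base policies is to draw a base policy according to the mixture weights at the start of each episode and follow it throughout, so the resulting occupancy measure is the corresponding convex combination of the base policies' occupancy measures. The environment and policies below are hypothetical; this is not the optimization algorithm studied in the paper.

```python
import numpy as np

def sample_mixture_episode(base_policies, mixture_weights, env_step, init_state,
                           horizon, rng):
    """Draw one base policy per episode according to the mixture weights and
    follow it; the state-action occupancy is then the convex combination of
    the base policies' occupancies."""
    policy = base_policies[rng.choice(len(base_policies), p=mixture_weights)]
    state, trajectory = init_state, []
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = env_step(state, action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory

# Hypothetical two-state chain environment and two base policies.
env_step = lambda s, a: ((s + a) % 2, float(s == 1))
policies = [lambda s: 0, lambda s: 1]
rng = np.random.default_rng(0)
print(sample_mixture_episode(policies, [0.4, 0.6], env_step, 0, 5, rng))
```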

Posterior Sampling for Large Scale Reinforcement Learning

no code implementations • 21 Nov 2017 • Georgios Theocharous, Zheng Wen, Yasin Abbasi-Yadkori, Nikos Vlassis

Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity.

reinforcement-learning • Reinforcement Learning (RL)

Does Weather Matter? Causal Analysis of TV Logs

no code implementations • 25 Jan 2017 • Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, Nikos Vlassis, Zheng Wen

To the best of our knowledge, this is the first large-scale causal study of the impact of weather on TV watching patterns.

BIG-bench Machine Learning

A posteriori error bounds for joint matrix decomposition problems

no code implementations • NeurIPS 2016 • Nicolo Colombo, Nikos Vlassis

Joint matrix triangularization is often used for estimating the joint eigenstructure of a set M of matrices, with applications in signal processing and machine learning.
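
Joint matrix triangularization seeks a single orthogonal Q such that Q^T M_i Q is (approximately) upper triangular for every matrix in the set; the size of the strictly lower-triangular residual is the kind of observable quantity in which a posteriori bounds are stated. Below is a minimal sketch on synthetic matrices with a known shared triangularizer; the exact residual used in the paper's bounds may differ.

```python
import numpy as np

def lower_triangular_residual(Q, matrices):
    """Observable residual: norm of the strictly lower-triangular part of
    Q^T M_i Q, summed over the set of matrices."""
    return sum(np.linalg.norm(np.tril(Q.T @ M @ Q, k=-1)) for M in matrices)

rng = np.random.default_rng(0)
d = 4
Q0, _ = np.linalg.qr(rng.standard_normal((d, d)))   # shared triangularizer
matrices = [Q0 @ np.triu(rng.standard_normal((d, d))) @ Q0.T for _ in range(3)]

Q_random, _ = np.linalg.qr(rng.standard_normal((d, d)))
print(lower_triangular_residual(Q0, matrices))        # ~0: exact triangularizer
print(lower_triangular_residual(Q_random, matrices))  # large residual
```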

Low-dimensional Data Embedding via Robust Ranking

no code implementations • 30 Nov 2016 • Ehsan Amid, Nikos Vlassis, Manfred K. Warmuth

We describe a new method called t-ETE for finding a low-dimensional embedding of a set of objects in Euclidean space.

Approximate Joint Matrix Triangularization

no code implementations • 2 Jul 2016 • Nicolo Colombo, Nikos Vlassis

The a priori bounds are theoretical inequalities that involve functions of the ground-truth matrices and noise matrices, whereas the a posteriori bounds are given in terms of observable quantities that can be computed from the input matrices.

Tensor Decomposition
