no code implementations • 4 Mar 2024 • Facundo Mémoli, Brantley Vose, Robert C. Williamson
We introduce a notion of distance between supervised learning problems, which we call the Risk distance.
no code implementations • 25 Jan 2024 • Rabanus Derr, Robert C. Williamson
Machine learning has traditionally focused on types of losses and their corresponding regret.
no code implementations • 17 Jul 2023 • Laura Iacovissi, Nan Lu, Robert C. Williamson
Corruption is frequently observed in collected data and has been extensively studied in machine learning under different corruption models.
no code implementations • 26 Jun 2023 • Christian Fröhlich, Robert C. Williamson
Tracing the interaction of uncertainty, fairness and responsibility in insurance provides a fresh perspective on fairness in machine learning.
no code implementations • 23 Feb 2023 • Armando J. Cabrera Pacheco, Robert C. Williamson
Mixable loss functions are of fundamental importance in the context of prediction with expert advice in the online setting, since they characterize fast learning rates.
no code implementations • 1 Sep 2022 • Robert C. Williamson, Zac Cranko
In this paper we systematically develop the theory of loss functions for such problems from a novel perspective whose basic ingredients are convex sets with a particular structure.
no code implementations • 5 Aug 2022 • Christian Fröhlich, Robert C. Williamson
As a concrete example, we focus on divergence risk measures based on f-divergence ambiguity sets, which are a widespread tool used to foster distributional robustness of machine learning systems.
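For concreteness, a divergence risk measure over an $f$-divergence ambiguity set has the standard form (the radius $\epsilon$ and notation below are illustrative, not taken verbatim from the paper):

$$
R_\epsilon(X) \;=\; \sup_{Q \,:\, D_f(Q\|P)\,\le\,\epsilon} \mathbb{E}_Q[X],
\qquad
D_f(Q\|P) \;=\; \int f\!\left(\tfrac{dQ}{dP}\right)\mathrm{d}P,
$$

with $f$ convex and $f(1) = 0$; enlarging $\epsilon$ hedges against a wider set of distributions around the reference $P$.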
no code implementations • 27 Jul 2022 • Rabanus Derr, Robert C. Williamson
In this paper, we dissect the role of statistical independence in fairness and randomness notions regularly used in machine learning.
no code implementations • 25 Jul 2022 • Robert C. Williamson, Zac Cranko
We introduce two new classes of measures of information for statistical experiments which generalise and subsume $\phi$-divergences, integral probability metrics, $\mathfrak{N}$-distances (MMD), and $(f,\Gamma)$-divergences between two or more distributions.
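For reference, the two best-known families being subsumed are (standard definitions):

$$
D_\phi(P\|Q) \;=\; \int \phi\!\left(\tfrac{dP}{dQ}\right)\mathrm{d}Q,
\qquad
\gamma_{\mathcal F}(P,Q) \;=\; \sup_{f \in \mathcal F}\,\bigl|\mathbb{E}_P[f] - \mathbb{E}_Q[f]\bigr|,
$$

the $\phi$-divergences and the integral probability metrics respectively; MMD is the special case of the latter in which $\mathcal F$ is the unit ball of an RKHS.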
no code implementations • 7 Jun 2022 • Christian Fröhlich, Robert C. Williamson
Machine learning typically presupposes classical probability theory which implies that aggregation is built upon expectation.
no code implementations • 19 May 2022 • Yishay Mansour, Richard Nock, Robert C. Williamson
A landmark negative result of Long and Servedio established a spectacular worst-case failure of a supervised learning trio (loss, algorithm, model) otherwise praised for its high precision machinery.
no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Benjamin Guedj, Robert C. Williamson
Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation.
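As a hedged illustration of the quantity itself (a standard empirical tail-average estimator, not code from the paper):

```python
import numpy as np

def cvar(losses: np.ndarray, alpha: float = 0.95) -> float:
    """Empirical Conditional Value at Risk at level alpha.

    CVaR_alpha averages the losses in the upper (1 - alpha) tail;
    at alpha = 0 it reduces to the ordinary expectation, which is
    the sense in which CVaR generalizes it.
    """
    var = np.quantile(losses, alpha)   # Value at Risk: the alpha-quantile
    return float(losses[losses >= var].mean())

rng = np.random.default_rng(0)
print(cvar(rng.standard_normal(100_000), alpha=0.95))  # approx. 2.06 for N(0, 1)
```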
no code implementations • NeurIPS 2019 • Hisham Husain, Richard Nock, Robert C. Williamson
First, we find that the $f$-GAN and WAE objectives partake in a primal-dual relationship and are equivalent under some assumptions, which then allows us to explicate the success of WAE.
no code implementations • 19 Feb 2019 • Zac Cranko, Robert C. Williamson, Richard Nock
The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used.
1 code implementation • 24 Jan 2019 • Robert C. Williamson, Aditya Krishna Menon
In this paper, we propose a new definition of fairness that generalises some existing proposals, while allowing for generic sensitive features and resulting in a convex objective.
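A rough sketch of how such a convex, group-aware objective can be assembled (the CVaR-style aggregation over per-group risks below is an illustrative reading of the abstract, not a verbatim transcription of the paper's definition):

```python
import numpy as np

def fairness_risk(losses: np.ndarray, groups: np.ndarray, alpha: float = 0.5) -> float:
    """Aggregate per-group average losses with a CVaR-style tail mean.

    Illustrative assumption: averaging only the worst-off groups (those
    at or above the alpha-quantile of group risks) up-weights
    disadvantaged groups, while the objective stays convex in the
    individual losses, being a maximum of convex combinations of them.
    """
    group_risks = np.array([losses[groups == g].mean() for g in np.unique(groups)])
    threshold = np.quantile(group_risks, alpha)
    return float(group_risks[group_risks >= threshold].mean())
```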
no code implementations • 20 May 2018 • Parameswaran Kamalaruban, Robert C. Williamson
The cost-sensitive classification problem plays a crucial role in mission-critical machine learning applications, and differs from traditional classification by taking misclassification costs into consideration.
no code implementations • 20 May 2018 • Parameswaran Kamalaruban, Robert C. Williamson, Xinhua Zhang
In special cases like the Aggregating Algorithm (Vovk, 1995) with mixable losses and the Weighted Average Algorithm (Kivinen & Warmuth, 1999) with exp-concave losses, it is possible to achieve $O(1)$ regret bounds.
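A minimal sketch of the exponential-weights machinery common to both algorithms (the learning rate $\eta$, the loss, and the use of a plain weighted mean as the prediction are simplifying assumptions):

```python
import numpy as np

def exp_weights_forecaster(expert_preds, outcomes, loss, eta=1.0):
    """Exponentially weighted average forecaster (sketch).

    expert_preds: (T, K) array, predictions of K experts over T rounds.
    outcomes:     (T,) array of revealed outcomes.
    loss:         loss(prediction, outcome) -> float.
    Weights track exp(-eta * cumulative loss); for mixable losses the
    Aggregating Algorithm replaces the weighted mean below with a
    substitution function, which is what yields O(1) regret.
    """
    T, K = expert_preds.shape
    w = np.full(K, 1.0 / K)
    preds = np.empty(T)
    for t in range(T):
        preds[t] = w @ expert_preds[t]  # predict with the weighted mean
        w *= np.exp(-eta * np.array([loss(p, outcomes[t]) for p in expert_preds[t]]))
        w /= w.sum()                    # renormalize the posterior over experts
    return preds
```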
no code implementations • NeurIPS 2018 • Zakaria Mhammedi, Robert C. Williamson
For a given entropy $\Phi$, losses for which a constant regret is possible using the generalized aggregating algorithm (GAA) are called $\Phi$-mixable.
1 code implementation • 12 Oct 2017 • Daniel McNamara, Cheng Soon Ong, Robert C. Williamson
These provable properties can be used in a governance model involving a data producer, a data user and a data regulator, where there is a separation of concerns between fairness and target task utility to ensure transparency and prevent perverse incentives.
1 code implementation • NeurIPS 2017 • Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu, Robert C. Williamson
In this paper, we unveil a broad class of distributions for which such convergence happens, namely deformed exponential families (a wide superset of exponential families), and show tight connections with the three other key GAN parameters: loss, game and architecture.
no code implementations • 25 May 2017 • Aditya Krishna Menon, Robert C. Williamson
We study the problem of learning classifiers with a fairness constraint, with three main contributions towards the goal of quantifying the problem's inherent tradeoffs.
no code implementations • 9 Nov 2016 • Daniel McNamara, Cheng Soon Ong, Robert C. Williamson
We propose the idea of a risk gap induced by representation learning for a given prediction context, which measures the difference in the risk of some learner using the learned features as compared to the original inputs.
no code implementations • 9 Jul 2015 • Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson
For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning.
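For orientation, the Bernstein condition referred to here is the standard one (constants $B$, $\beta$ as usually stated):

$$
\mathbb{E}\bigl[(\ell_f - \ell_{f^*})^2\bigr] \;\le\; B\,\bigl(\mathbb{E}[\ell_f - \ell_{f^*}]\bigr)^{\beta}
\quad \text{for all } f \in \mathcal{F},
$$

for some $B > 0$ and $\beta \in (0,1]$, where $f^*$ is the risk minimizer; $\beta = 1$ gives the fastest $O(1/n)$ rates.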
no code implementations • 4 Jun 2015 • Brendan van Rooyen, Aditya Krishna Menon, Robert C. Williamson
When working with a high- or infinite-dimensional kernel, it is imperative, for reasons of evaluation speed and storage, that as few training samples as possible are used in the kernel expansion.
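The cost being referred to is visible directly in the form of the predictor (a generic RBF expansion; names are illustrative):

```python
import numpy as np

def rbf_kernel(x: np.ndarray, y: np.ndarray, gamma: float = 1.0) -> float:
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def predict(x, support_points, alphas, gamma=1.0):
    """Kernel expansion f(x) = sum_i alpha_i * k(x_i, x).

    Both evaluation time and storage scale linearly in the number of
    support points kept, hence the premium on sparse expansions.
    """
    return sum(a * rbf_kernel(xi, x, gamma) for a, xi in zip(alphas, support_points))
```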
1 code implementation • NeurIPS 2015 • Brendan van Rooyen, Aditya Krishna Menon, Robert C. Williamson
However, Long and Servedio [2010] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing.
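In symbols: under SLN each label is flipped independently with probability $\rho < \tfrac{1}{2}$, i.e. $P(\tilde{y} = -y \mid x) = \rho$. The linear "unhinged" loss advocated in this paper, $\ell(v, y) = 1 - yv$, escapes the Long-Servedio failure because

$$
\mathbb{E}_{\tilde{y}}\bigl[\,1 - \tilde{y}v\,\bigr] \;=\; (1 - 2\rho)(1 - yv) + 2\rho,
$$

an affine transformation of the clean loss, so the noisy and clean minimizers coincide.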
no code implementations • 1 Apr 2015 • Brendan van Rooyen, Robert C. Williamson
Feature Learning aims to extract relevant information contained in data sets in an automated fashion.
no code implementations • 1 Apr 2015 • Brendan van Rooyen, Robert C. Williamson
In this paper we develop a general framework for tackling such problems and introduce upper and lower bounds on the risk for learning in the presence of corruption.
no code implementations • 24 Jun 2014 • Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson, Nishant Mehta
Mixability is a property of a loss which characterizes when fast convergence is possible in the game of prediction with expert advice.
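Concretely (standard definition): a loss $\ell$ is $\eta$-mixable if for every distribution $\pi$ over expert predictions $\{p_i\}$ there exists a single prediction $p$ with

$$
\ell(p, y) \;\le\; -\frac{1}{\eta}\,\log \sum_i \pi_i\, e^{-\eta\,\ell(p_i, y)}
\quad \text{for all outcomes } y,
$$

which is precisely what allows the aggregating algorithm to merge the experts' predictions with only constant regret.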
no code implementations • NeurIPS 2014 • Nishant A. Mehta, Robert C. Williamson
In the non-statistical prediction with expert advice setting, there is an analogous slow and fast rate phenomenon, and it is entirely characterized in terms of the mixability of the loss $\ell$ (there being no role there for $\mathcal{F}$ or $\mathsf{P}$).
no code implementations • 10 Mar 2014 • Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson
Mixability of a loss is known to characterise when constant regret bounds are achievable in games of prediction with expert advice through the use of Vovk's aggregating algorithm.
no code implementations • 20 Feb 2014 • Brendan van Rooyen, Robert C. Williamson
"Deep Learning" methods attempt to learn generic features in an unsupervised fashion from a large unlabelled data set.
no code implementations • NeurIPS 2012 • Tim van Erven, Peter Grünwald, Mark D. Reid, Robert C. Williamson
We show that, in the special case of log-loss, stochastic mixability reduces to a well-known (but usually unnamed) martingale condition, which is used in existing convergence theorems for minimum description length and Bayesian inference.
no code implementations • NeurIPS 2011 • Elodie Vernet, Mark D. Reid, Robert C. Williamson
We also show that the integral representation for binary proper losses can not be extended to multiclass losses.
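The representation in question expresses any binary proper loss as a weighted integral of cost-sensitive misclassification losses,

$$
\ell(\hat{\eta}, y) \;=\; \int_0^1 \ell_c(\hat{\eta}, y)\, w(c)\,\mathrm{d}c,
$$

where $\ell_c$ is the cost-weighted 0-1 loss with threshold $c$ and $w \ge 0$ is a weight function; the result above says that no analogous representation holds in the multiclass case.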