Search Results for author: Wittawat Jitkrittum

Found 35 papers, 16 papers with code

Language Model Cascades: Token-level uncertainty and beyond

no code implementations15 Apr 2024 Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

While the principles underpinning cascading are well-studied for classification tasks - with deferral based on predicted class uncertainty favored theoretically and practically - a similar understanding is lacking for generative LM tasks.

Language Modelling

It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models

no code implementations13 Oct 2023 Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar

Classical wisdom in machine learning holds that the generalization error can be decomposed into bias and variance, and these two terms exhibit a trade-off.
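
For reference, the decomposition in question (the standard squared-error form for y = f(x) + ε with noise variance σ²; notation ours, not the paper's):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The paper's point is that, in deep models, the bias and variance terms need not trade off against each other.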

When Does Confidence-Based Cascade Deferral Suffice?

no code implementations NeurIPS 2023 Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

Cascades are a classical strategy to enable inference cost to vary adaptively across samples, wherein a sequence of classifiers is invoked in turn.
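
A minimal sketch of confidence-based deferral in a two-model cascade (the model names, the predict_proba interface, and the 0.9 threshold are illustrative assumptions, not from the paper):

```python
import numpy as np

def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Invoke the cheap model first; defer to the expensive model only
    when the cheap model's top predicted class probability is low."""
    probs = small_model.predict_proba(x)        # shape: (num_classes,)
    if probs.max() >= threshold:                # confident: stop early
        return int(probs.argmax())
    return int(large_model.predict_proba(x).argmax())  # defer
```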

Plugin estimators for selective classification with out-of-distribution detection

no code implementations29 Jan 2023 Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Sanjiv Kumar

Recent work on selective classification with OOD detection (SCOD) has argued for the unified study of these problems; however, the formal underpinnings of this problem are still nascent, and existing techniques are heuristic in nature.

Out-of-Distribution (OOD) Detection

A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch

no code implementations5 Aug 2022 Patsorn Sangkloy, Wittawat Jitkrittum, Diyi Yang, James Hays

We empirically demonstrate that using an input sketch (even a poorly drawn one) in addition to text considerably increases retrieval recall compared to traditional text-based image retrieval.

Image Retrieval, Retrieval

Discussion of 'Multiscale Fisher's Independence Test for Multivariate Dependence'

no code implementations22 Jun 2022 Antonin Schrab, Wittawat Jitkrittum, Zoltán Szabó, Dino Sejdinovic, Arthur Gretton

We discuss how MultiFIT, the Multiscale Fisher's Independence Test for Multivariate Dependence proposed by Gorsky and Ma (2022), compares to existing linear-time kernel tests based on the Hilbert-Schmidt independence criterion (HSIC).

ELM: Embedding and Logit Margins for Long-Tail Learning

no code implementations27 Apr 2022 Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

Long-tail learning is the problem of learning under skewed label distributions, which pose a challenge for standard learners.

Contrastive Learning, Long-tail Learning
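
As a rough illustration of the logit-margin idea in long-tail learning, here is a generic class-prior margin loss in the style of logit adjustment; the actual ELM objective also involves embedding margins and differs in detail:

```python
import torch
import torch.nn.functional as F

def prior_margin_loss(logits, labels, class_priors, tau=1.0):
    """Cross-entropy with logit offsets proportional to log class priors,
    so that rare (tail) classes receive a larger effective margin.
    class_priors: empirical label frequencies, shape (num_classes,)."""
    adjusted = logits + tau * torch.log(class_priors)
    return F.cross_entropy(adjusted, labels)
```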

Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

no code implementations12 May 2021 Ankit Singh Rawat, Aditya Krishna Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix X. Yu, Sashank Reddi, Sanjiv Kumar

Negative sampling schemes enable efficient training given a large number of classes, by offering a means to approximate a computationally expensive loss function that takes all labels into account.

Retrieval
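
A bare-bones sampled-softmax sketch of the idea (uniform negatives, no bias correction; the paper's analysis concerns exactly the bias such uncorrected schemes introduce):

```python
import numpy as np

def sampled_softmax_loss(score_fn, x, pos_label, num_classes, num_neg=100, rng=None):
    """Approximate full softmax cross-entropy by scoring the positive
    label against a uniform sample of negatives instead of all classes.
    score_fn(x, c) -> scalar logit; an assumed interface."""
    rng = rng or np.random.default_rng()
    negs = rng.choice(num_classes, size=num_neg, replace=False)
    pos = score_fn(x, pos_label)
    logits = np.array([pos] + [score_fn(x, c) for c in negs])
    return -(pos - np.logaddexp.reduce(logits))
```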

A Witness Two-Sample Test

1 code implementation10 Feb 2021 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

The test set is used to simultaneously estimate the expectations and define the basis points, while the training set serves only to select the kernel and is then discarded.

Two-sample testing
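
A minimal sketch of the split-and-t-test idea (Gaussian kernel, fixed bandwidth; the paper's procedure for choosing the kernel and basis points is richer):

```python
import numpy as np
from scipy import stats

def gauss_kernel(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

def witness_t_test(X, Y, bw=1.0, seed=0):
    """Fit the MMD witness on one half of the data, then run a
    two-sample t-test on its values over the held-out halves."""
    rng = np.random.default_rng(seed)
    X, Y = rng.permutation(X), rng.permutation(Y)
    hx, hy = len(X) // 2, len(Y) // 2
    Xtr, Xte, Ytr, Yte = X[:hx], X[hx:], Y[:hy], Y[hy:]
    witness = lambda T: gauss_kernel(T, Xtr, bw).mean(1) - gauss_kernel(T, Ytr, bw).mean(1)
    # Under H0: P = Q, witness values on the two test halves share a mean.
    return stats.ttest_ind(witness(Xte), witness(Yte))
```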

Kernel Distributionally Robust Optimization

2 code implementations12 Jun 2020 Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

We prove a theorem that generalizes the classical duality in the mathematical problem of moments.

Stochastic Optimization

Learning Kernel Tests Without Data Splitting

1 code implementation NeurIPS 2020 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics.
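
A sketch of that split-then-test pipeline for MMD (bandwidth picked on one half by maximizing the raw MMD² estimate, a crude proxy for test power; the permutation test then runs on the untouched half so validity is preserved):

```python
import numpy as np

def mmd2_biased(X, Y, bw):
    """Biased quadratic-time estimate of MMD^2 with a Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bw**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

def split_mmd_test(X, Y, bandwidths, n_perm=200, seed=0):
    rng = np.random.default_rng(seed)
    hx, hy = len(X) // 2, len(Y) // 2
    bw = max(bandwidths, key=lambda b: mmd2_biased(X[:hx], Y[:hy], b))
    Xte, Yte = X[hx:], Y[hy:]
    stat = mmd2_biased(Xte, Yte, bw)
    Z, n = np.vstack([Xte, Yte]), len(Xte)
    null = [mmd2_biased(Z[p[:n]], Z[p[n:]], bw)
            for p in (rng.permutation(len(Z)) for _ in range(n_perm))]
    return stat, float(np.mean(np.array(null) >= stat))  # (statistic, p-value)
```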

Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem

no code implementations31 Mar 2020 Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

In order to anticipate rare and impactful events, we propose to quantify the worst-case risk under distributional ambiguity using a recent development in kernel methods: the kernel mean embedding.

Testing Goodness of Fit of Conditional Density Models with Kernels

1 code implementation24 Feb 2020 Wittawat Jitkrittum, Heishiro Kanagawa, Bernhard Schölkopf

We propose two nonparametric statistical tests of goodness of fit for conditional distributions: given a conditional probability density function $p(y|x)$ and a joint sample, decide whether the sample is drawn from $p(y|x)r_x(x)$ for some density $r_x$.

Two-sample testing

Kernel Conditional Moment Test via Maximum Moment Restriction

1 code implementation21 Feb 2020 Krikamol Muandet, Wittawat Jitkrittum, Jonas Kübler

We propose a new family of specification tests called kernel conditional moment (KCM) tests.

Two-sample testing

Kernel-Guided Training of Implicit Generative Models with Stability Guarantees

no code implementations29 Oct 2019 Arash Mehrjou, Wittawat Jitkrittum, Krikamol Muandet, Bernhard Schölkopf

Modern implicit generative models such as generative adversarial networks (GANs) are generally known to suffer from issues such as instability, uninterpretability, and difficulty in assessing their performance.

Kernel Stein Tests for Multiple Model Comparison

3 code implementations NeurIPS 2019 Jen Ning Lim, Makoto Yamada, Bernhard Schölkopf, Wittawat Jitkrittum

The first test, building on the post-selection inference framework, provably controls the number of best models that are wrongly declared worse (false positive rate).

More Powerful Selective Kernel Tests for Feature Selection

1 code implementation14 Oct 2019 Jen Ning Lim, Makoto Yamada, Wittawat Jitkrittum, Yoshikazu Terada, Shigeyuki Matsui, Hidetoshi Shimodaira

An approach for addressing this is to condition on the selection procedure, accounting for how the data have been used to generate the hypotheses and preventing the same information from being used again after selection.

feature selection, Selection bias

ABCDP: Approximate Bayesian Computation with Differential Privacy

no code implementations11 Oct 2019 Mijung Park, Margarita Vinaroz, Wittawat Jitkrittum

The sparse vector technique (SVT) incurs the privacy cost only when a condition (whether a quantity of interest is above or below a threshold) is met.

Privacy Preserving
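
For context, the textbook AboveThreshold form of SVT (Dwork and Roth's formulation, not the ABCDP algorithm itself; assumes each query has sensitivity 1):

```python
import numpy as np

def above_threshold(query_values, threshold, epsilon, seed=None):
    """Sparse Vector Technique, one positive answer: privacy budget is
    consumed once, no matter how many queries fall below the threshold."""
    rng = np.random.default_rng(seed)
    noisy_t = threshold + rng.laplace(scale=2.0 / epsilon)
    for i, q in enumerate(query_values):
        if q + rng.laplace(scale=4.0 / epsilon) >= noisy_t:
            return i      # index of the first query deemed above threshold
    return None           # all queries answered 'below'
```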

A Kernel Stein Test for Comparing Latent Variable Models

1 code implementation1 Jul 2019 Heishiro Kanagawa, Wittawat Jitkrittum, Lester Mackey, Kenji Fukumizu, Arthur Gretton

We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables, such that the marginal distribution of the observed variables is intractable.

Kernel Mean Matching for Content Addressability of GANs

1 code implementation14 May 2019 Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf

We propose a novel procedure which adds "content-addressability" to any given unconditional implicit model, e.g., a generative adversarial network (GAN).

Generative Adversarial Network, Image Generation

Informative Features for Model Comparison

3 code implementations NeurIPS 2018 Wittawat Jitkrittum, Heishiro Kanagawa, Patsorn Sangkloy, James Hays, Bernhard Schölkopf, Arthur Gretton

Given two candidate models, and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models.

Fisher Efficient Inference of Intractable Models

1 code implementation NeurIPS 2019 Song Liu, Takafumi Kanamori, Wittawat Jitkrittum, Yu Chen

For example, the asymptotic variance of the MLE attains the asymptotic Cramér-Rao lower bound (the efficiency bound), which is the minimum possible variance achievable by an unbiased estimator.

Density Ratio Estimation

Large sample analysis of the median heuristic

1 code implementation23 Jul 2017 Damien Garreau, Wittawat Jitkrittum, Motonobu Kanagawa

In kernel methods, the median heuristic has been widely used as a way of setting the bandwidth of RBF kernels.
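
The heuristic itself is one line of linear algebra; a sketch (conventions vary, e.g., median of squared distances or an extra scaling factor, so treat this as one common variant):

```python
import numpy as np

def median_heuristic_bandwidth(X):
    """Set the RBF bandwidth to the median pairwise Euclidean
    distance between the points in X (shape: (n, d))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    iu = np.triu_indices_from(d2, k=1)       # distinct pairs only
    return float(np.median(np.sqrt(d2[iu])))
```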

A Linear-Time Kernel Goodness-of-Fit Test

4 code implementations NeurIPS 2017 Wittawat Jitkrittum, Wenkai Xu, Zoltan Szabo, Kenji Fukumizu, Arthur Gretton

We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples.

An Adaptive Test of Independence with Analytic Kernel Embeddings

1 code implementation ICML 2017 Wittawat Jitkrittum, Zoltan Szabo, Arthur Gretton

The dependence measure is the difference between analytic embeddings of the joint distribution and the product of the marginals, evaluated at a finite set of locations (features).
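
A sketch of that finite-location dependence witness (Gaussian kernels; the paper's normalized statistic additionally standardizes these features by their covariance):

```python
import numpy as np

def gauss(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

def dependence_features(X, Y, V, W, bwx=1.0, bwy=1.0):
    """For J location pairs (V[j], W[j]), return the empirical embedding
    of the joint minus the product of the marginal embeddings; all J
    entries are near zero when X and Y are independent."""
    Kx, Ky = gauss(X, V, bwx), gauss(Y, W, bwy)     # both (n, J)
    return (Kx * Ky).mean(0) - Kx.mean(0) * Ky.mean(0)
```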

Interpretable Distribution Features with Maximum Testing Power

1 code implementation NeurIPS 2016 Wittawat Jitkrittum, Zoltan Szabo, Kacper Chwialkowski, Arthur Gretton

Two semimetrics on probability distributions are proposed, given as the sum of differences of expectations of analytic functions evaluated at spatial or frequency locations (i.e., features).
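
In symbols, the spatial ("mean embedding") variant of such a semimetric, with test locations v_1, ..., v_J and an analytic kernel k (a standard unnormalized form; the actual test statistic standardizes by a covariance):

```latex
d^2(P, Q) = \frac{1}{J} \sum_{j=1}^{J}
  \Big( \mathbb{E}_{X \sim P}\, k(X, v_j) - \mathbb{E}_{Y \sim Q}\, k(Y, v_j) \Big)^2
```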

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages

1 code implementation9 Mar 2015 Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó

We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output.

regression

K2-ABC: Approximate Bayesian Computation with Kernel Embeddings

no code implementations9 Feb 2015 Mijung Park, Wittawat Jitkrittum, Dino Sejdinovic

Complicated generative models often result in a situation where computing the likelihood of observed data is intractable, while simulating from the conditional density given a parameter value is relatively easy.
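
A sketch of soft-weighted likelihood-free inference in this spirit (in K2-ABC the discrepancy is a squared MMD between the observed and simulated datasets; the names and exponential weighting scale here are illustrative):

```python
import numpy as np

def abc_posterior_weights(y_obs, prior_draws, simulate, discrepancy, epsilon=0.1):
    """For each parameter drawn from the prior, simulate data and weight
    the parameter by how close the simulation is to the observations."""
    w = np.array([np.exp(-discrepancy(y_obs, simulate(theta)) / epsilon)
                  for theta in prior_draws])
    return w / w.sum()    # normalized posterior importance weights
```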

Passing Expectation Propagation Messages with Kernel Methods

no code implementations2 Jan 2015 Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess

We propose to learn a kernel-based message operator which takes as input all expectation propagation (EP) incoming messages to a factor node and produces an outgoing message.

Bayesian Manifold Learning: The Locally Linear Latent Variable Model (LL-LVM)

no code implementations NeurIPS 2015 Mijung Park, Wittawat Jitkrittum, Ahmad Qamar, Zoltan Szabo, Lars Buesing, Maneesh Sahani

We introduce the Locally Linear Latent Variable Model (LL-LVM), a probabilistic model for non-linear manifold discovery that describes a joint distribution over observations, their manifold coordinates and locally linear maps conditioned on a set of neighbourhood relationships.

High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso

no code implementations2 Feb 2012 Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama

We first show that, with particular choices of kernel functions, non-redundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures.

feature selection
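
For reference, the HSIC Lasso objective associated with this line of work has the form below, where \bar{L} and \bar{K}^{(k)} are centered Gram matrices of the output and of the k-th feature (a standard form from the literature; exact scaling may differ from the paper):

```latex
\min_{\alpha_1, \dots, \alpha_d \ge 0} \;
  \frac{1}{2} \Big\| \bar{L} - \sum_{k=1}^{d} \alpha_k \bar{K}^{(k)} \Big\|_F^2
  + \lambda \sum_{k=1}^{d} \alpha_k
```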
