Search Results for author: Wittawat Jitkrittum

Found 35 papers, 16 papers with code

Language Model Cascades: Token-level uncertainty and beyond

no code implementations15 Apr 2024 Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

While the principles underpinning cascading are well-studied for classification tasks - with deferral based on predicted class uncertainty favored theoretically and practically - a similar understanding is lacking for generative LM tasks.

Language Modelling

It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models

no code implementations13 Oct 2023 Lin Chen, Michal Lukasik, Wittawat Jitkrittum, Chong You, Sanjiv Kumar

Classical wisdom in machine learning holds that the generalization error can be decomposed into bias and variance, and these two terms exhibit a trade-off.
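
For reference, the decomposition in question (the standard squared-error form for y = f(x) + ε with noise variance σ²; notation ours, not the paper's):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The paper's point is that, in deep models, the bias and variance terms need not trade off against each other.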

When Does Confidence-Based Cascade Deferral Suffice?

no code implementations NeurIPS 2023 Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

Cascades are a classical strategy to enable inference cost to vary adaptively across samples, wherein a sequence of classifiers is invoked in turn.
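
A minimal sketch of confidence-based deferral in a two-model cascade (the model names, the predict_proba interface, and the 0.9 threshold are illustrative assumptions, not from the paper):

```python
import numpy as np

def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Invoke the cheap model first; defer to the expensive model only
    when the cheap model's top predicted class probability is low."""
    probs = small_model.predict_proba(x)        # shape: (num_classes,)
    if probs.max() >= threshold:                # confident: stop early
        return int(probs.argmax())
    return int(large_model.predict_proba(x).argmax())  # defer
```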

Plugin estimators for selective classification with out-of-distribution detection

no code implementations29 Jan 2023 Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Sanjiv Kumar

Recent work on selective classification with OOD detection (SCOD) has argued for the unified study of these problems; however, the formal underpinnings of this problem are still nascent, and existing techniques are heuristic in nature.

Out-of-Distribution (OOD) Detection

A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch

no code implementations5 Aug 2022 Patsorn Sangkloy, Wittawat Jitkrittum, Diyi Yang, James Hays

We empirically demonstrate that using an input sketch (even a poorly drawn one) in addition to text considerably increases retrieval recall compared to traditional text-based image retrieval.

Image Retrieval, Retrieval

Discussion of 'Multiscale Fisher's Independence Test for Multivariate Dependence'

no code implementations22 Jun 2022 Antonin Schrab, Wittawat Jitkrittum, Zoltán Szabó, Dino Sejdinovic, Arthur Gretton

We discuss how MultiFIT, the Multiscale Fisher's Independence Test for Multivariate Dependence proposed by Gorsky and Ma (2022), compares to existing linear-time kernel tests based on the Hilbert-Schmidt independence criterion (HSIC).

ELM: Embedding and Logit Margins for Long-Tail Learning

no code implementations27 Apr 2022 Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

Long-tail learning is the problem of learning under skewed label distributions, which pose a challenge for standard learners.

Contrastive Learning, Long-tail Learning
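
As a rough illustration of the logit-margin idea in long-tail learning, here is a generic class-prior margin loss in the style of logit adjustment; the actual ELM objective also involves embedding margins and differs in detail:

```python
import torch
import torch.nn.functional as F

def prior_margin_loss(logits, labels, class_priors, tau=1.0):
    """Cross-entropy with logit offsets proportional to log class priors,
    so that rare (tail) classes receive a larger effective margin.
    class_priors: empirical label frequencies, shape (num_classes,)."""
    adjusted = logits + tau * torch.log(class_priors)
    return F.cross_entropy(adjusted, labels)
```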

Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

no code implementations12 May 2021 Ankit Singh Rawat, Aditya Krishna Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix X. Yu, Sashank Reddi, Sanjiv Kumar

Negative sampling schemes enable efficient training given a large number of classes, by offering a means to approximate a computationally expensive loss function that takes all labels into account.

Retrieval
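
A bare-bones sampled-softmax sketch of the idea (uniform negatives, no bias correction; the paper's analysis concerns exactly the bias such uncorrected schemes introduce):

```python
import numpy as np

def sampled_softmax_loss(score_fn, x, pos_label, num_classes, num_neg=100, rng=None):
    """Approximate full softmax cross-entropy by scoring the positive
    label against a uniform sample of negatives instead of all classes.
    score_fn(x, c) -> scalar logit; an assumed interface."""
    rng = rng or np.random.default_rng()
    negs = rng.choice(num_classes, size=num_neg, replace=False)
    pos = score_fn(x, pos_label)
    logits = np.array([pos] + [score_fn(x, c) for c in negs])
    return -(pos - np.logaddexp.reduce(logits))
```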

A Witness Two-Sample Test

1 code implementation10 Feb 2021 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

The test set is used to simultaneously estimate the expectations and define the basis points, while the training set serves only to select the kernel and is then discarded.

Two-sample testing
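
A minimal sketch of the split-and-t-test idea (Gaussian kernel, fixed bandwidth; the paper's procedure for choosing the kernel and basis points is richer):

```python
import numpy as np
from scipy import stats

def gauss_kernel(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

def witness_t_test(X, Y, bw=1.0, seed=0):
    """Fit the MMD witness on one half of the data, then run a
    two-sample t-test on its values over the held-out halves."""
    rng = np.random.default_rng(seed)
    X, Y = rng.permutation(X), rng.permutation(Y)
    hx, hy = len(X) // 2, len(Y) // 2
    Xtr, Xte, Ytr, Yte = X[:hx], X[hx:], Y[:hy], Y[hy:]
    witness = lambda T: gauss_kernel(T, Xtr, bw).mean(1) - gauss_kernel(T, Ytr, bw).mean(1)
    # Under H0: P = Q, witness values on the two test halves share a mean.
    return stats.ttest_ind(witness(Xte), witness(Yte))
```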

Kernel Distributionally Robust Optimization

2 code implementations12 Jun 2020 Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

We prove a theorem that generalizes the classical duality in the mathematical problem of moments.

Stochastic Optimization

Learning Kernel Tests Without Data Splitting

1 code implementation NeurIPS 2020 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics.
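
A sketch of that split-then-test pipeline for MMD (bandwidth picked on one half by maximizing the raw MMD² estimate, a crude proxy for test power; the permutation test then runs on the untouched half so validity is preserved):

```python
import numpy as np

def mmd2_biased(X, Y, bw):
    """Biased quadratic-time estimate of MMD^2 with a Gaussian kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bw**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

def split_mmd_test(X, Y, bandwidths, n_perm=200, seed=0):
    rng = np.random.default_rng(seed)
    hx, hy = len(X) // 2, len(Y) // 2
    bw = max(bandwidths, key=lambda b: mmd2_biased(X[:hx], Y[:hy], b))
    Xte, Yte = X[hx:], Y[hy:]
    stat = mmd2_biased(Xte, Yte, bw)
    Z, n = np.vstack([Xte, Yte]), len(Xte)
    null = [mmd2_biased(Z[p[:n]], Z[p[n:]], bw)
            for p in (rng.permutation(len(Z)) for _ in range(n_perm))]
    return stat, float(np.mean(np.array(null) >= stat))  # (statistic, p-value)
```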

Worst-Case Risk Quantification under Distributional Ambiguity using Kernel Mean Embedding in Moment Problem

no code implementations31 Mar 2020 Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

In order to anticipate rare and impactful events, we propose to quantify the worst-case risk under distributional ambiguity using a recent development in kernel methods: the kernel mean embedding.

Testing Goodness of Fit of Conditional Density Models with Kernels

1 code implementation24 Feb 2020 Wittawat Jitkrittum, Heishiro Kanagawa, Bernhard Schölkopf

We propose two nonparametric statistical tests of goodness of fit for conditional distributions: given a conditional probability density function $p(y|x)$ and a joint sample, decide whether the sample is drawn from $p(y|x)r_x(x)$ for some density $r_x$.

Two-sample testing

Kernel Conditional Moment Test via Maximum Moment Restriction

1 code implementation21 Feb 2020 Krikamol Muandet, Wittawat Jitkrittum, Jonas Kübler

We propose a new family of specification tests called kernel conditional moment (KCM) tests.

Two-sample testing

Kernel-Guided Training of Implicit Generative Models with Stability Guarantees

no code implementations29 Oct 2019 Arash Mehrjou, Wittawat Jitkrittum, Krikamol Muandet, Bernhard Schölkopf

Modern implicit generative models such as generative adversarial networks (GANs) are generally known to suffer from issues such as instability, uninterpretability, and difficulty in assessing their performance.

Kernel Stein Tests for Multiple Model Comparison

3 code implementations NeurIPS 2019 Jen Ning Lim, Makoto Yamada, Bernhard Schölkopf, Wittawat Jitkrittum

The first test, building on the post-selection inference framework, provably controls the number of best models that are wrongly declared worse (false positive rate).

More Powerful Selective Kernel Tests for Feature Selection

1 code implementation14 Oct 2019 Jen Ning Lim, Makoto Yamada, Wittawat Jitkrittum, Yoshikazu Terada, Shigeyuki Matsui, Hidetoshi Shimodaira

An approach for addressing this is to condition on the selection procedure, accounting for how the data have been used to generate the hypotheses and preventing the same information from being used again after selection.

feature selection, Selection bias

ABCDP: Approximate Bayesian Computation with Differential Privacy

no code implementations11 Oct 2019 Mijung Park, Margarita Vinaroz, Wittawat Jitkrittum

The sparse vector technique (SVT) incurs the privacy cost only when a condition (whether a quantity of interest is above or below a threshold) is met.

Privacy Preserving
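
For context, the textbook AboveThreshold form of SVT (Dwork and Roth's formulation, not the ABCDP algorithm itself; assumes each query has sensitivity 1):

```python
import numpy as np

def above_threshold(query_values, threshold, epsilon, seed=None):
    """Sparse Vector Technique, one positive answer: privacy budget is
    consumed once, no matter how many queries fall below the threshold."""
    rng = np.random.default_rng(seed)
    noisy_t = threshold + rng.laplace(scale=2.0 / epsilon)
    for i, q in enumerate(query_values):
        if q + rng.laplace(scale=4.0 / epsilon) >= noisy_t:
            return i      # index of the first query deemed above threshold
    return None           # all queries answered 'below'
```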

A Kernel Stein Test for Comparing Latent Variable Models

1 code implementation1 Jul 2019 Heishiro Kanagawa, Wittawat Jitkrittum, Lester Mackey, Kenji Fukumizu, Arthur Gretton

We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables, such that the marginal distribution of the observed variables is intractable.

Kernel Mean Matching for Content Addressability of GANs

1 code implementation14 May 2019 Wittawat Jitkrittum, Patsorn Sangkloy, Muhammad Waleed Gondal, Amit Raj, James Hays, Bernhard Schölkopf

We propose a novel procedure which adds "content-addressability" to any given unconditional implicit model, e.g., a generative adversarial network (GAN).

Generative Adversarial Network, Image Generation

Informative Features for Model Comparison

3 code implementations NeurIPS 2018 Wittawat Jitkrittum, Heishiro Kanagawa, Patsorn Sangkloy, James Hays, Bernhard Schölkopf, Arthur Gretton

Given two candidate models, and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models.

Fisher Efficient Inference of Intractable Models

1 code implementation NeurIPS 2019 Song Liu, Takafumi Kanamori, Wittawat Jitkrittum, Yu Chen

For example, the asymptotic variance of the MLE attains the asymptotic Cramér-Rao lower bound (the efficiency bound), which is the minimum possible variance achievable by an unbiased estimator.

Density Ratio Estimation

Large sample analysis of the median heuristic

1 code implementation23 Jul 2017 Damien Garreau, Wittawat Jitkrittum, Motonobu Kanagawa

In kernel methods, the median heuristic has been widely used as a way of setting the bandwidth of RBF kernels.
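
The heuristic itself is one line of linear algebra; a sketch (conventions vary, e.g., median of squared distances or an extra scaling factor, so treat this as one common variant):

```python
import numpy as np

def median_heuristic_bandwidth(X):
    """Set the RBF bandwidth to the median pairwise Euclidean
    distance between the points in X (shape: (n, d))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    iu = np.triu_indices_from(d2, k=1)       # distinct pairs only
    return float(np.median(np.sqrt(d2[iu])))
```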

A Linear-Time Kernel Goodness-of-Fit Test

4 code implementations NeurIPS 2017 Wittawat Jitkrittum, Wenkai Xu, Zoltan Szabo, Kenji Fukumizu, Arthur Gretton

We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples.

An Adaptive Test of Independence with Analytic Kernel Embeddings

1 code implementation ICML 2017 Wittawat Jitkrittum, Zoltan Szabo, Arthur Gretton

The dependence measure is the difference between analytic embeddings of the joint distribution and the product of the marginals, evaluated at a finite set of locations (features).
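
A sketch of that finite-location dependence witness (Gaussian kernels; the paper's normalized statistic additionally standardizes these features by their covariance):

```python
import numpy as np

def gauss(A, B, bw):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

def dependence_features(X, Y, V, W, bwx=1.0, bwy=1.0):
    """For J location pairs (V[j], W[j]), return the empirical embedding
    of the joint minus the product of the marginal embeddings; all J
    entries are near zero when X and Y are independent."""
    Kx, Ky = gauss(X, V, bwx), gauss(Y, W, bwy)     # both (n, J)
    return (Kx * Ky).mean(0) - Kx.mean(0) * Ky.mean(0)
```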

Interpretable Distribution Features with Maximum Testing Power

1 code implementation NeurIPS 2016 Wittawat Jitkrittum, Zoltan Szabo, Kacper Chwialkowski, Arthur Gretton

Two semimetrics on probability distributions are proposed, given as the sum of differences of expectations of analytic functions evaluated at spatial or frequency locations (i.e., features).
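
In symbols, the spatial ("mean embedding") variant of such a semimetric, with test locations v_1, ..., v_J and an analytic kernel k (a standard unnormalized form; the actual test statistic standardizes by a covariance):

```latex
d^2(P, Q) = \frac{1}{J} \sum_{j=1}^{J}
  \Big( \mathbb{E}_{X \sim P}\, k(X, v_j) - \mathbb{E}_{Y \sim Q}\, k(Y, v_j) \Big)^2
```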

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages

1 code implementation9 Mar 2015 Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó

We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output.

regression

K2-ABC: Approximate Bayesian Computation with Kernel Embeddings

no code implementations9 Feb 2015 Mijung Park, Wittawat Jitkrittum, Dino Sejdinovic

Complicated generative models often result in a situation where computing the likelihood of observed data is intractable, while simulating from the conditional density given a parameter value is relatively easy.
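
A sketch of soft-weighted likelihood-free inference in this spirit (in K2-ABC the discrepancy is a squared MMD between the observed and simulated datasets; the names and exponential weighting scale here are illustrative):

```python
import numpy as np

def abc_posterior_weights(y_obs, prior_draws, simulate, discrepancy, epsilon=0.1):
    """For each parameter drawn from the prior, simulate data and weight
    the parameter by how close the simulation is to the observations."""
    w = np.array([np.exp(-discrepancy(y_obs, simulate(theta)) / epsilon)
                  for theta in prior_draws])
    return w / w.sum()    # normalized posterior importance weights
```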

Passing Expectation Propagation Messages with Kernel Methods

no code implementations2 Jan 2015 Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess

We propose to learn a kernel-based message operator which takes as input all expectation propagation (EP) incoming messages to a factor node and produces an outgoing message.

Bayesian Manifold Learning: The Locally Linear Latent Variable Model (LL-LVM)

no code implementations NeurIPS 2015 Mijung Park, Wittawat Jitkrittum, Ahmad Qamar, Zoltan Szabo, Lars Buesing, Maneesh Sahani

We introduce the Locally Linear Latent Variable Model (LL-LVM), a probabilistic model for non-linear manifold discovery that describes a joint distribution over observations, their manifold coordinates and locally linear maps conditioned on a set of neighbourhood relationships.

High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso

no code implementations2 Feb 2012 Makoto Yamada, Wittawat Jitkrittum, Leonid Sigal, Eric P. Xing, Masashi Sugiyama

We first show that, with particular choices of kernel functions, non-redundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures.

feature selection
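
For reference, the HSIC Lasso objective associated with this line of work has the form below, where \bar{L} and \bar{K}^{(k)} are centered Gram matrices of the output and of the k-th feature (a standard form from the literature; exact scaling may differ from the paper):

```latex
\min_{\alpha_1, \dots, \alpha_d \ge 0} \;
  \frac{1}{2} \Big\| \bar{L} - \sum_{k=1}^{d} \alpha_k \bar{K}^{(k)} \Big\|_F^2
  + \lambda \sum_{k=1}^{d} \alpha_k
```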
