Search Results for author: Issei Sato

Found 75 papers, 17 papers with code

Accelerating the diffusion-based ensemble sampling by non-reversible dynamics

no code implementations ICML 2020 Futoshi Futami, Issei Sato, Masashi Sugiyama

Compared with naive parallel-chain SGLD, which updates multiple particles independently, ensemble methods update particles through their interactions.

Bayesian Inference
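
To make the contrast above concrete, here is a minimal sketch of parallel-chain SGLD with an optional interaction term. The quadratic target and the attraction-to-the-mean coupling are illustrative placeholders only, not the paper's non-reversible dynamics.

    import numpy as np

    def grad_U(theta):
        # Toy quadratic potential: gradient of 0.5 * ||theta||^2 (placeholder target).
        return theta

    def sgld_step(thetas, eta, rng, coupling=0.0):
        # thetas: (n_particles, dim) array. coupling=0.0 recovers naive
        # parallel-chain SGLD (particles move independently); coupling > 0 adds
        # a simple illustrative interaction: attraction toward the ensemble mean.
        noise = rng.normal(size=thetas.shape)
        interaction = coupling * (thetas.mean(axis=0) - thetas)
        return thetas - eta * grad_U(thetas) + eta * interaction + np.sqrt(2.0 * eta) * noise

    rng = np.random.default_rng(0)
    particles = rng.normal(size=(10, 2))
    for _ in range(1000):
        particles = sgld_step(particles, eta=1e-2, rng=rng, coupling=0.1)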

End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training

1 code implementation 14 Feb 2024 Keitaro Sakamoto, Issei Sato

End-to-end (E2E) training, optimizing the entire model through error backpropagation, fundamentally supports the advancements of deep learning.

Information Plane

Understanding Parameter Saliency via Extreme Value Theory

no code implementations 27 Oct 2023 Shuo Wang, Issei Sato

Furthermore, we show that the existing parameter saliency method exhibits a bias against the depth of layers in deep neural networks.

Anomaly Detection · Saliency Ranking

Initialization Bias of Fourier Neural Operator: Revisiting the Edge of Chaos

no code implementations 10 Oct 2023 Takeshi Koshizuka, Masahiro Fujisawa, Yusuke Tanaka, Issei Sato

Building upon this observation, we also propose an edge of chaos initialization scheme for FNO to mitigate the negative initialization bias leading to training instability.

Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?

no code implementations 26 Jul 2023 Tokio Kajitsuka, Issei Sato

Existing analyses of the expressive capacity of Transformer models have required excessively deep layers for data memorization, leading to a discrepancy with the Transformers actually used in practice.

Memorization

Exploring Weight Balancing on Long-Tailed Recognition Problem

no code implementations 26 May 2023 Naoya Hasegawa, Issei Sato

Recognition problems in long-tailed data, in which the sample size per class is heavily skewed, have gained importance because the distribution of the sample size per class in a dataset is generally exponential unless the sample size is intentionally adjusted.

Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective

no code implementations 15 May 2022 Keitaro Sakamoto, Issei Sato

The lottery ticket hypothesis (LTH) has attracted attention because it can explain why over-parameterized models often show high generalization ability.

Goldilocks-curriculum Domain Randomization and Fractal Perlin Noise with Application to Sim2Real Pneumonia Lesion Detection

no code implementations 29 Apr 2022 Takahiro Suzuki, Shouhei Hanaoka, Issei Sato

An obstacle in developing a CAD system for a disease is that the number of medical images is typically too small to improve the performance of the machine learning model.

BIG-bench Machine Learning · Lesion Detection

Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey

no code implementations 18 Apr 2022 Kento Nozawa, Issei Sato

Representation learning enables us to automatically extract generic feature representations from a dataset to solve another machine learning task.

BIG-bench Machine Learning · Representation Learning

Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics

1 code implementation 11 Apr 2022 Takeshi Koshizuka, Issei Sato

Population dynamics is the study of temporal and spatial variation in the size of populations of organisms and is a major part of population ecology.

A Closer Look at Prototype Classifier for Few-shot Image Classification

no code implementations 11 Oct 2021 Mingcheng Hou, Issei Sato

The prototypical network is a prototype classifier based on meta-learning and is widely used for few-shot learning because it classifies unseen examples by constructing class-specific prototypes without adjusting hyper-parameters during meta-testing.

Few-Shot Image Classification · Few-Shot Learning

Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum

no code implementations 29 Sep 2021 Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama

Specifically, we disentangle the effects of Adaptive Learning Rate and Momentum of the Adam dynamics on saddle-point escaping and flat minima selection.

Disentanglement Analysis with Partial Information Decomposition

no code implementations ICLR 2022 Seiya Tokui, Issei Sato

We propose a framework to analyze how multivariate representations disentangle ground-truth generative factors.

Disentanglement

Toward Neural-Network-Guided Program Synthesis and Verification

1 code implementation 17 Mar 2021 Naoki Kobayashi, Taro Sekiyama, Issei Sato, Hiroshi Unno

Another application is to a new program development framework called oracle-based programming, which is a neural-network-guided variation of Solar-Lezama's program synthesis by sketching.

Program Synthesis

Abelian Neural Networks

no code implementations 24 Feb 2021 Kenshin Abe, Takanori Maehara, Issei Sato

We study the problem of modeling a binary operation that satisfies some algebraic requirements.

Word Embeddings

Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning

1 code implementation NeurIPS 2021 Kento Nozawa, Issei Sato

Instance-discriminative self-supervised representation learning has attracted attention thanks to its unsupervised nature and the informative feature representations it provides for downstream tasks.

Representation Learning

Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification

1 code implementation 1 Feb 2021 Nan Lu, Shida Lei, Gang Niu, Issei Sato, Masashi Sugiyama

SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation.

Binary Classification · Classification +2

On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective

1 code implementation NeurIPS 2023 Zeke Xie, Zhiqiang Xu, Jingzhao Zhang, Issei Sato, Masashi Sugiyama

Weight decay is a simple yet powerful regularization technique that has been widely used in training deep neural networks (DNNs).

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

1 code implementation 12 Nov 2020 Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, DaCheng Tao, Masashi Sugiyama

This motivates us to design a similar mechanism, named "artificial neural variability" (ANV), which helps artificial neural networks learn some advantages from "natural" neural networks.

Memorization

Stable Weight Decay Regularization

no code implementations 28 Sep 2020 Zeke Xie, Issei Sato, Masashi Sugiyama

Loshchilov & Hutter (2018) demonstrated that $L_{2}$ regularization is not identical to weight decay for adaptive gradient methods, such as Adaptive Momentum Estimation (Adam), and proposed Adam with Decoupled Weight Decay (AdamW).
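
The distinction at issue is easy to state in code. Below is a minimal sketch of one Adam-style step contrasting $L_{2}$ regularization (decay folded into the gradient) with decoupled weight decay as in AdamW; bias correction is omitted for brevity.

    import numpy as np

    def adam_style_step(w, g, m, v, lr=1e-3, beta1=0.9, beta2=0.999,
                        eps=1e-8, wd=1e-2, decoupled=True):
        # decoupled=False: L2 regularization; the decay term enters the gradient
        # and hence the adaptive second-moment statistics.
        if not decoupled:
            g = g + wd * w
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        w = w - lr * m / (np.sqrt(v) + eps)  # bias correction omitted for brevity
        # decoupled=True: AdamW-style decay applied directly to the weights,
        # bypassing the adaptive statistics.
        if decoupled:
            w = w - lr * wd * w
        return w, m, v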

Active Classification with Uncertainty Comparison Queries

1 code implementation 3 Aug 2020 Zhenghang Cui, Issei Sato

We then propose an efficient adaptive labeling algorithm using the proposed oracle and the positivity comparison oracle.

Active Learning · Classification +1

Diagnostic Uncertainty Calibration: Towards Reliable Machine Predictions in Medical Domain

no code implementations 3 Jul 2020 Takahiro Mimori, Keiko Sasada, Hirotaka Matsui, Issei Sato

We propose an evaluation framework for class probability estimates (CPEs) in the presence of label uncertainty, which is commonly observed as diagnosis disagreement between experts in the medical domain.

Adai: Separating the Effects of Adaptive Learning Rate and Momentum Inertia

1 code implementation 29 Jun 2020 Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama

Specifically, we disentangle the effects of Adaptive Learning Rate and Momentum of the Adam dynamics on saddle-point escaping and minima selection.

LFD-ProtoNet: Prototypical Network Based on Local Fisher Discriminant Analysis for Few-shot Learning

no code implementations 15 Jun 2020 Kei Mukaiyama, Issei Sato, Masashi Sugiyama

The prototypical network (ProtoNet) is a few-shot learning framework that performs metric learning and classification using the distance to prototype representations of each class.

Few-Shot Learning · General Classification +1
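
A minimal sketch of the plain prototype classifier described above, assuming pre-computed embeddings; the paper's contribution replaces this vanilla construction with local Fisher discriminant analysis.

    import numpy as np

    def class_prototypes(support_x, support_y, n_classes):
        # Prototype = mean embedding of each class's support examples.
        return np.stack([support_x[support_y == c].mean(axis=0)
                         for c in range(n_classes)])

    def predict(query_x, protos):
        # Assign each query to the class of its nearest prototype (Euclidean).
        dists = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        return dists.argmin(axis=1)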

$\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator

no code implementations 13 Jun 2020 Masahiro Fujisawa, Takeshi Teshima, Issei Sato, Masashi Sugiyama

Approximate Bayesian computation (ABC) is a likelihood-free inference method that has been employed in various applications.

Pairwise Supervision Can Provably Elicit a Decision Boundary

no code implementations 11 Jun 2020 Han Bao, Takuya Shimada, Liyuan Xu, Issei Sato, Masashi Sugiyama

A classifier built upon the representations is expected to perform well in downstream classification; however, little theory has been given in the literature so far, and the relationship between similarity and classification has thereby remained elusive.

Binary Classification · Classification +5

Sequential Gallery for Interactive Visual Design Optimization

no code implementations 8 May 2020 Yuki Koyama, Issei Sato, Masataka Goto

To help users respond to plane-search queries, we also propose using a gallery-based interface that provides options in the two-dimensional subspace arranged in an adaptive grid view.

Bayesian Optimization

Few-shot Domain Adaptation by Causal Mechanism Transfer

1 code implementation ICML 2020 Takeshi Teshima, Issei Sato, Masashi Sugiyama

We take the structural equations in causal modeling as an example and propose a novel DA method, which is shown to be useful both theoretically and experimentally.

Domain Adaptation

Bayesian interpretation of SGD as Ito process

no code implementations 20 Nov 2019 Soma Yokoi, Issei Sato

The current interpretation of stochastic gradient descent (SGD) as a stochastic process lacks generality in that its numerical scheme restricts continuous-time dynamics as well as the loss function and the distribution of gradient noise.
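
As background, a hedged sketch of the standard correspondence: with learning rate $\eta$ and minibatch gradient noise $\zeta_k$ with covariance $C(\theta)$, the SGD recursion is read as an Euler-Maruyama discretization of an Ito SDE,

    $\theta_{k+1} = \theta_k - \eta\left(\nabla L(\theta_k) + \zeta_k\right) \quad\longleftrightarrow\quad d\theta_t = -\nabla L(\theta_t)\,dt + \sqrt{\eta\, C(\theta_t)}\, dW_t.$

Making this correspondence precise is what imposes the restrictions on the dynamics, the loss, and the gradient-noise distribution that the abstract refers to.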

Classification from Triplet Comparison Data

1 code implementation 24 Jul 2019 Zhenghang Cui, Nontawat Charoenphakdee, Issei Sato, Masashi Sugiyama

Although learning from triplet comparison data has been considered in many applications, an important fundamental question of whether we can learn a classifier only from triplet comparison data has remained unanswered.

Classification · General Classification +1

Interactive Optimization of Generative Image Modeling using Sequential Subspace Search and Content-based Guidance

no code implementations 24 Jun 2019 Toby Chong Long Hin, I-Chao Shen, Issei Sato, Takeo Igarashi

We present a human-in-the-optimization method that allows users to directly explore and search the latent vector space of generative image modeling.

Image Generation

Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero

2 code implementations 28 May 2019 Kenshin Abe, Zijian Xu, Issei Sato, Masashi Sugiyama

There have been increasing attempts to solve combinatorial optimization problems with machine learning.

Combinatorial Optimization · Q-Learning

Use of Ghost Cytometry to Differentiate Cells with Similar Gross Morphologic Characteristics

no code implementations 22 Mar 2019 Hiroaki Adachi, Yoko Kawamura, Keiji Nakagawa, Ryoichi Horisaki, Issei Sato, Satoko Yamaguchi, Katsuhito Fujiu, Kayo Waki, Hiroyuki Noji, Sadao Ota

Imaging flow cytometry shows significant potential for increasing our understanding of heterogeneous and complex life systems and is useful for biomedical applications.

General Classification

On Learning from Ghost Imaging without Imaging

no code implementations 14 Mar 2019 Issei Sato

Computational ghost imaging is an imaging technique in which an object is imaged from light collected using a single-pixel detector with no spatial resolution.

BIG-bench Machine Learning · Classification +1

On Transformations in Stochastic Gradient MCMC

no code implementations 7 Mar 2019 Soma Yokoi, Takuma Otsuka, Issei Sato

Although SGLD is designed for unbounded random variables, many practical models incorporate variables with boundaries such as non-negative ones or those in a finite interval.
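
A sketch of the standard change-of-variables device behind such transformations: to sample a constrained variable $\theta = g(x)$ by running unconstrained SGLD on $x$, one targets the transformed density

    $\tilde{p}(x) = p\big(g(x)\big)\,\left|\det J_g(x)\right|,$

for example $\theta = e^{x}$ for a non-negative variable, giving $\log \tilde{p}(x) = \log p(e^{x}) + x$. How such transformations interact with the SGLD dynamics is the question the paper examines.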

PAC-Bayes Analysis of Sentence Representation

no code implementations 12 Feb 2019 Kento Nozawa, Issei Sato

Learning sentence vectors from an unlabeled corpus has attracted attention because such vectors can represent sentences in a lower dimensional and continuous space.

BIG-bench Machine Learning · Sentence +1

Online Multiclass Classification Based on Prediction Margin for Partial Feedback

no code implementations 4 Feb 2019 Takuo Kaneko, Issei Sato, Masashi Sugiyama

We consider the problem of online multiclass classification with partial feedback, where an algorithm predicts a class for a new instance in each round and only receives its correctness.

Classification · General Classification

Multilevel Monte Carlo Variational Inference

no code implementations 1 Feb 2019 Masahiro Fujisawa, Issei Sato

We theoretically show that, with our method, the variance of the gradient estimator decreases as optimization proceeds and that a learning rate scheduler function helps improve the convergence.

Stochastic Optimization · Variational Inference
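
A sketch of the multilevel identity this method builds on: for approximations $P_0, \dots, P_L$ of a target quantity with increasing fidelity and cost,

    $\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}[P_\ell - P_{\ell-1}],$

and estimating each correction $P_\ell - P_{\ell-1}$ from coupled samples yields low-variance terms, so most of the sampling budget can go to the cheap level $P_0$.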

Clipped Matrix Completion: A Remedy for Ceiling Effects

no code implementations 13 Sep 2018 Takeshi Teshima, Miao Xu, Issei Sato, Masashi Sugiyama

On the other hand, matrix completion (MC) methods can recover a low-rank matrix from various information deficits by using the principle of low-rank completion.

Matrix Completion · Recommendation Systems
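
The low-rank completion principle mentioned here is classically expressed as a convex program; a sketch of the standard (non-clipped) formulation:

    $\min_{M} \; \|M\|_{*} \quad \text{subject to} \quad M_{ij} = X_{ij} \;\; \text{for } (i, j) \in \Omega,$

where $\|\cdot\|_{*}$ is the nuclear norm and $\Omega$ indexes the observed entries. Clipped matrix completion must additionally model entries censored at a ceiling value, which this constraint set does not capture.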

On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions

no code implementations CVPR 2019 Yusuke Tsuzuku, Issei Sato

Data-agnostic quasi-imperceptible perturbations on inputs are known to degrade recognition accuracy of deep convolutional networks severely.

Unsupervised Domain Adaptation Based on Source-guided Discrepancy

no code implementations 11 Sep 2018 Seiichi Kuroki, Nontawat Charoenphakdee, Han Bao, Junya Honda, Issei Sato, Masashi Sugiyama

A previously proposed discrepancy that does not use the source domain labels requires high computational cost to estimate and may lead to a loose generalization error bound in the target domain.

Unsupervised Domain Adaptation

Variational Inference for Gaussian Process with Panel Count Data

no code implementations 12 Mar 2018 Hongyi Ding, Young Lee, Issei Sato, Masashi Sugiyama

We present the first framework for Gaussian-process-modulated Poisson processes when the temporal data appear in the form of panel counts.

Variational Inference

Analysis of Minimax Error Rate for Crowdsourcing and Its Application to Worker Clustering Model

1 code implementation ICML 2018 Hideaki Imamura, Issei Sato, Masashi Sugiyama

In this paper, we derive a minimax error rate under a more practical setting for a broader class of crowdsourcing models including the DS model as a special case.

Clustering

Gaussian Process Classification with Privileged Information by Soft-to-Hard Labeling Transfer

no code implementations 12 Feb 2018 Ryosuke Kamesawa, Issei Sato, Masashi Sugiyama

A state-of-the-art method of Gaussian process classification (GPC) with privileged information is GPC+, which incorporates privileged information into a noise term of the likelihood.

Gaussian Processes · General Classification +1

A Quantum-Inspired Ensemble Method and Quantum-Inspired Forest Regressors

no code implementations 22 Nov 2017 Zeke Xie, Issei Sato

The contribution of this work is two-fold: a novel ensemble regression algorithm inspired by quantum mechanics, and a theoretical connection between quantum interpretations and machine learning algorithms.

regression

Variational Inference based on Robust Divergences

1 code implementation 18 Oct 2017 Futoshi Futami, Issei Sato, Masashi Sugiyama

In this paper, based on Zellner's optimization and variational formulation of Bayesian inference, we propose an outlier-robust pseudo-Bayesian variational method by replacing the Kullback-Leibler divergence used for data fitting with a robust divergence such as the beta- or gamma-divergence.

Bayesian Inference · Variational Inference
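
As a reference point for the divergences named above, one common form of the beta-divergence (the density power divergence) between a data density $g$ and a model density $f$ is, as a sketch,

    $d_\beta(g, f) = \int \left\{ f(x)^{1+\beta} - \left(1 + \tfrac{1}{\beta}\right) g(x) f(x)^{\beta} + \tfrac{1}{\beta}\, g(x)^{1+\beta} \right\} dx, \qquad \beta > 0,$

which recovers the KL divergence as $\beta \to 0$ and increasingly down-weights outliers as $\beta$ grows.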

On the Model Shrinkage Effect of Gamma Process Edge Partition Models

no code implementations NeurIPS 2017 Iku Ohama, Issei Sato, Takuya Kida, Hiroki Arimura

In order to ensure that the model shrinkage effect of the EPM works in an appropriate manner, we proposed two novel generative constructions of the EPM: CEPM incorporating constrained gamma priors, and DEPM incorporating Dirichlet priors instead of the gamma priors.

Link Prediction

Evaluating the Variance of Likelihood-Ratio Gradient Estimators

no code implementations ICML 2017 Seiya Tokui, Issei Sato

The framework gives a natural derivation of the optimal estimator that can be interpreted as a special case of the likelihood-ratio method so that we can evaluate the optimal degree of practical techniques with it.
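
For context, the likelihood-ratio (score-function) gradient estimator in its standard form, as a sketch:

    $\nabla_\theta \mathbb{E}_{p_\theta(x)}[f(x)] = \mathbb{E}_{p_\theta(x)}\left[f(x)\,\nabla_\theta \log p_\theta(x)\right] \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i)\,\nabla_\theta \log p_\theta(x_i), \quad x_i \sim p_\theta.$

Its variance depends strongly on $f$, which is why baselines and other variance-reduction techniques exist; the paper's framework evaluates how close such techniques come to the optimal estimator.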

Expectation Propagation for t-Exponential Family Using Q-Algebra

no code implementations NeurIPS 2017 Futoshi Futami, Issei Sato, Masashi Sugiyama

Exponential family distributions are highly useful in machine learning since their calculation can be performed efficiently through natural parameters.
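
For reference, the property alluded to here: a standard exponential family has the density

    $p(x \mid \theta) = h(x)\,\exp\left(\langle \theta, T(x) \rangle - A(\theta)\right),$

so multiplying factors and updating beliefs reduces to adding natural parameters $\theta$. The t-exponential family replaces $\exp$ with a deformed exponential, and the paper's q-algebra machinery plays the analogous role in restoring such calculations.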

Bayesian Nonparametric Poisson-Process Allocation for Time-Sequence Modeling

1 code implementation 19 May 2017 Hongyi Ding, Mohammad Emtiyaz Khan, Issei Sato, Masashi Sugiyama

We model the intensity of each sequence as an infinite mixture of latent functions, each of which is obtained using a function drawn from a Gaussian process.

Variational Inference

Stochastic Divergence Minimization for Biterm Topic Model

no code implementations 1 May 2017 Zhenghang Cui, Issei Sato, Masashi Sugiyama

With the emergence and thriving development of social networks, a huge number of short texts have accumulated and need to be processed.

Topic Models · Variational Inference

Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags

1 code implementation 22 Apr 2017 Han Bao, Tomoya Sakai, Issei Sato, Masashi Sugiyama

Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available.

Content-Based Image Retrieval · Multiple Instance Learning +2

Differential Privacy without Sensitivity

no code implementations NeurIPS 2016 Kentaro Minami, Hitomi Arai, Issei Sato, Hiroshi Nakagawa

The exponential mechanism is a general method to construct a randomized estimator that satisfies $(\varepsilon, 0)$-differential privacy.
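
For reference, a minimal sketch of the exponential mechanism over a finite candidate set (the utility values and candidates below are placeholders): an output is sampled with probability proportional to $\exp(\varepsilon\, u / (2 \Delta u))$, where $\Delta u$ is the sensitivity of the utility function; the paper's subject is precisely what to do when this sensitivity is unavailable.

    import numpy as np

    def exponential_mechanism(utilities, eps, sensitivity, rng):
        # utilities: u(D, o) for each candidate output o (placeholder inputs).
        scores = eps * np.asarray(utilities, dtype=float) / (2.0 * sensitivity)
        scores -= scores.max()  # shift for numerical stability; probabilities unchanged
        probs = np.exp(scores)
        probs /= probs.sum()
        return rng.choice(len(utilities), p=probs)

    rng = np.random.default_rng(0)
    choice = exponential_mechanism([3.0, 1.0, 0.5], eps=1.0, sensitivity=1.0, rng=rng)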

Does Distributionally Robust Supervised Learning Give Robust Classifiers?

no code implementations ICML 2018 Weihua Hu, Gang Niu, Issei Sato, Masashi Sugiyama

Since the DRSL is explicitly formulated for a distribution shift scenario, we naturally expect it to give a robust classifier that can aggressively handle shifted distributions.

BIG-bench Machine Learning · General Classification

Reparameterization trick for discrete variables

no code implementations 4 Nov 2016 Seiya Tokui, Issei Sato

Low-variance gradient estimation is crucial for learning directed graphical models parameterized by neural networks, where the reparameterization trick is widely used for those with continuous variables.
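
For context, the continuous-variable trick referenced here, in its Gaussian form: writing $z = \mu_\phi + \sigma_\phi \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$ moves the parameters out of the sampling distribution, so that

    $\nabla_\phi\, \mathbb{E}_{q_\phi(z)}[f(z)] = \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I)}\left[\nabla_\phi f(\mu_\phi + \sigma_\phi \odot \epsilon)\right],$

a low-variance estimator with no direct analogue for discrete $z$; bridging that gap is the subject of this paper.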

Analysis of Variational Bayesian Latent Dirichlet Allocation: Weaker Sparsity Than MAP

no code implementations NeurIPS 2014 Shinichi Nakajima, Issei Sato, Masashi Sugiyama, Kazuho Watanabe, Hiroko Kobayashi

Latent Dirichlet allocation (LDA) is a popular generative model of various objects such as texts and images, where an object is expressed as a mixture of latent topics.

Collapsed Variational Bayes Inference of Infinite Relational Model

no code implementations 16 Sep 2014 Katsuhiko Ishiguro, Issei Sato, Naonori Ueda

The Infinite Relational Model (IRM) is a probabilistic model for relational data clustering that partitions objects into clusters based on observed relationships.

Clustering

Quantum Annealing for Variational Bayes Inference

no code implementations 9 Aug 2014 Issei Sato, Kenichi Kurihara, Shu Tanaka, Hiroshi Nakagawa, Seiji Miyashita

This paper presents studies on a deterministic annealing algorithm based on quantum annealing for variational Bayes (QAVB) inference, which can be seen as an extension of the simulated annealing for variational Bayes (SAVB) inference.

Quantum Annealing for Dirichlet Process Mixture Models with Applications to Network Clustering

no code implementations 19 May 2013 Issei Sato, Shu Tanaka, Kenichi Kurihara, Seiji Miyashita, Hiroshi Nakagawa

We developed a new quantum annealing (QA) algorithm for Dirichlet process mixture (DPM) models based on the Chinese restaurant process (CRP).

Clustering · Stochastic Optimization

Deterministic Single-Pass Algorithm for LDA

no code implementations NeurIPS 2010 Issei Sato, Kenichi Kurihara, Hiroshi Nakagawa

We develop a deterministic single-pass algorithm for latent Dirichlet allocation (LDA) in order to process received documents one at a time and then discard them in an excess text stream.
