Search Results for author: Nobutaka Ono

Found 12 papers, 5 papers with code

Guided Masked Self-Distillation Modeling for Distributed Multimedia Sensor Event Analysis

no code implementations • 12 Apr 2024 • Masahiro Yasuda, Noboru Harada, Yasunori Ohishi, Shoichiro Saito, Akira Nakayama, Nobutaka Ono

This is because the information obtained from a single sensor is often missing or fragmented in such an environment; observations from multiple locations and modalities should be integrated to analyze events comprehensively.

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

no code implementations • 23 Jul 2023 • Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe

In detail, we explore multi-channel separation methods, mask-based beamforming and complex spectral mapping, as well as the best features to use in the ASR back-end model.

Automatic Speech Recognition (ASR) +4

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

no code implementations • 15 Feb 2023 • Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono

To address the challenges encountered in the CEC2 setting, we introduce four major novelties: (1) we extend the state-of-the-art TF-GridNet model, originally designed for monaural speaker separation, for multi-channel, causal speech enhancement, and large improvements are observed by replacing the TCNDenseNet used in iNeuBe with this new architecture; (2) we leverage a recent dual window size approach with future-frame prediction to ensure that iNeuBe-X satisfies the 5 ms constraint on algorithmic latency required by CEC2; (3) we introduce a novel speaker-conditioning branch for TF-GridNet to achieve target speaker extraction; (4) we propose a fine-tuning step, where we compute an additional loss with respect to the target speaker signal compensated with the listener audiogram.

Speaker Separation Speech Enhancement +1

Inverse-free Online Independent Vector Analysis with Flexible Iterative Source Steering

no code implementations • 2 Sep 2022 • Taishi Nakashima, Nobutaka Ono

In this paper, we propose a new online independent vector analysis (IVA) algorithm for real-time blind source separation (BSS).

blind source separation

Joint Optimization of Sampling Rate Offsets Based on Entire Signal Relationship Among Distributed Microphones

no code implementations • 27 Jun 2022 • Yoshiki Masuyama, Kouei Yamaoka, Nobutaka Ono

To address this problem, the proposed method jointly optimizes all SROs based on a probabilistic model of a multichannel signal.

Estimation of Consistent Time Delays in Subsample via Auxiliary-Function-Based Iterative Updates

1 code implementation • 18 Mar 2022 • Kouei Yamaoka, Yukoh Wakabayashi, Nobutaka Ono

We then consider a multidimensional CC (MCC) as the objective function, which is derived on the basis of maximum likelihood estimation.

Joint Dereverberation and Separation with Iterative Source Steering

no code implementations • 12 Feb 2021 • Taishi Nakashima, Robin Scheibler, Masahito Togami, Nobutaka Ono

In this case, we manage to reduce the number of matrix inversions to only one per iteration and source.

blind source separation

Independent Vector Analysis with more Microphones than Sources

1 code implementation • 20 May 2019 • Robin Scheibler, Nobutaka Ono

The performance of the algorithm is assessed on simulated signals.

Sound Audio and Speech Processing

Multi-modal Blind Source Separation with Microphones and Blinkies

2 code implementations • 4 Apr 2019 • Robin Scheibler, Nobutaka Ono

We show that alternating updates similar to those of independent vector analysis and Itakura-Saito non-negative matrix factorization decrease the negative log-likelihood of the joint distribution.

Sound Audio and Speech Processing

Deeply Learned Filter Response Functions for Hyperspectral Reconstruction

no code implementations • CVPR 2018 • Shijie Nie, Lin Gu, Yinqiang Zheng, Antony Lam, Nobutaka Ono, Imari Sato

More interestingly, by considering physical restrictions in the design process, we are able to realize the deeply learned spectral response functions by using modern film filter production technologies, and thus construct data-inspired multispectral cameras for snapshot hyperspectral imaging.

Spectral Reconstruction

A Stochastic Temporal Model of Polyphonic MIDI Performance with Ornaments

1 code implementation • 8 Apr 2014 • Eita Nakamura, Nobutaka Ono, Shigeki Sagayama, Kenji Watanabe

We study indeterminacies in realization of ornaments and how they can be incorporated in a stochastic performance model applicable for music information processing such as score-performance matching.

Outer-Product Hidden Markov Model and Polyphonic MIDI Score Following

1 code implementation • 8 Apr 2014 • Eita Nakamura, Tomohiko Nakamura, Yasuyuki Saito, Nobutaka Ono, Shigeki Sagayama

We present a polyphonic MIDI score-following algorithm capable of following performances with arbitrary repeats and skips, based on a probabilistic model of musical performances.
