Search Results for author: Fabien Cardinaux

Found 17 papers, 8 papers with code

LLM meets Vision-Language Models for Zero-Shot One-Class Classification

no code implementations • 31 Mar 2024 • Yassir Bendou, Giulia Lioi, Bastien Pasdeloup, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Vincent Gripon

In this setting, only the label of the target class is available, and the goal is to discriminate between positive and negative query samples without requiring any validation example from the target task.

One-Class Classification
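As an illustration of the setting described above, the sketch below scores query images against the text embedding of the target class label and thresholds the cosine similarity. It assumes the openai `clip` package; the query paths and the threshold value are hypothetical placeholders, and this is not the method proposed in the paper.

```python
# Illustration of zero-shot one-class scoring: rank queries against the text
# embedding of the target class label, then threshold on cosine similarity.
# Assumes the openai `clip` package (pip install git+https://github.com/openai/CLIP).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

target_label = "dog"  # only the label of the target class is available
text = clip.tokenize([f"a photo of a {target_label}"]).to(device)

with torch.no_grad():
    text_feat = model.encode_text(text)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    # hypothetical list of query images to classify as positive or negative
    query_paths = ["query_0.jpg", "query_1.jpg"]
    images = torch.stack([preprocess(Image.open(p)) for p in query_paths]).to(device)
    img_feat = model.encode_image(images)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)

    scores = (img_feat @ text_feat.T).squeeze(-1)  # cosine similarity per query

threshold = 0.25  # assumed value; no validation examples exist to tune it, which is the paper's point
is_positive = scores > threshold
```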

A Novel Benchmark for Few-Shot Semantic Segmentation in the Era of Foundation Models

1 code implementation • 20 Jan 2024 • Reda Bensaid, Vincent Gripon, François Leduc-Primeau, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux

In recent years, the rapid evolution of computer vision has seen the emergence of various foundation models, each tailored to specific data types and tasks.

Few-Shot Semantic Segmentation • Segmentation • +1

Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning

1 code implementation • 24 Nov 2023 • Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Giulia Lioi, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene

In this paper, we present a novel approach that leverages text-derived statistics to predict the mean and covariance of the visual feature distribution for each class.

Few-Shot Learning
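To illustrate how predicted per-class statistics can be used at inference time, the sketch below classifies query features by Mahalanobis distance to each class under its covariance. The means and covariances here are random stand-ins, not text-derived predictions, and the sketch is not the estimation procedure of the paper.

```python
# Illustration: classify query features with per-class Gaussian statistics
# (mean + covariance). The statistics are random stand-ins for text-derived ones.
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim = 5, 64

means = rng.normal(size=(n_classes, dim))             # stand-in for predicted means
covs = np.stack([np.eye(dim) * rng.uniform(0.5, 2.0)  # stand-in for predicted covariances
                 for _ in range(n_classes)])

def mahalanobis_logits(query, means, covs):
    """Negative squared Mahalanobis distance to each class mean."""
    logits = np.empty(len(means))
    for k, (mu, cov) in enumerate(zip(means, covs)):
        diff = query - mu
        logits[k] = -diff @ np.linalg.solve(cov, diff)
    return logits

query = rng.normal(size=dim)
pred = int(np.argmax(mahalanobis_logits(query, means, covs)))
print("predicted class:", pred)
```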

DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation

no code implementations • 7 Sep 2023 • Pau Mulet Arabi, Alec Flowers, Lukas Mauch, Fabien Cardinaux

Computing gradients of an expectation with respect to the distributional parameters of a discrete distribution is a problem arising in many fields of science and engineering.

Benchmarking • Neural Architecture Search
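For context, the classic baseline for this problem is the score-function (REINFORCE) estimator sketched below, which estimates the gradient of an expectation over a categorical distribution with respect to its logits. DBsurf itself is a different, discrepancy-based estimator and is not shown here.

```python
# Score-function (REINFORCE) estimate of d/dtheta E_{x ~ Cat(softmax(theta))}[f(x)].
# Standard baseline for discrete gradient estimation; not DBsurf.
import torch

theta = torch.zeros(4, requires_grad=True)   # logits of a 4-way categorical
f = lambda x: (x.float() - 2.0) ** 2          # arbitrary black-box objective

dist = torch.distributions.Categorical(logits=theta)
n_samples = 10_000
x = dist.sample((n_samples,))

# Surrogate whose gradient equals the score-function estimator:
# mean over samples of f(x) * log p(x), with f(x) treated as a constant.
surrogate = (f(x).detach() * dist.log_prob(x)).mean()
surrogate.backward()

print("estimated gradient:", theta.grad)
```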

Towards Robust FastSpeech 2 by Modelling Residual Multimodality

1 code implementation • 2 Jun 2023 • Fabian Kögel, Bac Nguyen, Fabien Cardinaux

State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech.

AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling

no code implementations • 21 Mar 2022 • Bac Nguyen, Fabien Cardinaux, Stefan Uhlich

Using this differentiable duration method, we introduce AutoTTS, a direct text-to-waveform speech synthesis model.

Speech Synthesis • Text-To-Speech Synthesis
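The snippet below shows one generic way to make duration-based upsampling differentiable, using Gaussian attention weights over token positions so gradients flow into the predicted durations. It illustrates the general idea of differentiable duration modelling, not AutoTTS's specific mechanism.

```python
# Generic differentiable length regulator via Gaussian upsampling:
# gradients flow back into the (continuous) predicted durations.
# Illustrative only; not AutoTTS's duration model.
import torch

def gaussian_upsample(token_feats, durations, sigma=1.0):
    # token_feats: (T_in, D), durations: (T_in,)
    centers = torch.cumsum(durations, dim=0) - 0.5 * durations   # token centers in frames
    t_out = int(durations.sum().round().item())
    t_grid = torch.arange(t_out, dtype=durations.dtype) + 0.5    # output frame positions
    logits = -((t_grid[:, None] - centers[None, :]) ** 2) / (2 * sigma ** 2)
    weights = torch.softmax(logits, dim=-1)                      # (T_out, T_in)
    return weights @ token_feats                                 # (T_out, D)

tokens = torch.randn(6, 16, requires_grad=True)    # 6 phoneme embeddings
durs = torch.full((6,), 3.0, requires_grad=True)   # predicted frames per phoneme
upsampled = gaussian_upsample(tokens, durs)
upsampled.sum().backward()                          # durations receive gradients
print(durs.grad)
```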

NVC-Net: End-to-End Adversarial Voice Conversion

1 code implementation • 2 Jun 2021 • Bac Nguyen, Fabien Cardinaux

By disentangling the speaker identity from the speech content, NVC-Net is able to perform non-parallel traditional many-to-many voice conversion as well as zero-shot voice conversion from a short utterance of an unseen target speaker.

Speech Synthesis • Voice Conversion
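The skeleton below sketches the generic disentanglement pattern the abstract describes (content encoder, speaker encoder, decoder) and how conversion swaps in the target speaker's embedding. The toy one-layer modules are placeholders; this is a structural illustration, not NVC-Net's actual architecture.

```python
# Structural sketch of content/speaker disentanglement for voice conversion:
# decode the source content with the target speaker's embedding.
# Placeholder layers; not NVC-Net's actual architecture.
import torch
import torch.nn as nn

class ToyVC(nn.Module):
    def __init__(self, n_mels=80, content_dim=64, spk_dim=32):
        super().__init__()
        self.content_enc = nn.Conv1d(n_mels, content_dim, kernel_size=5, padding=2)
        self.speaker_enc = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                         nn.Linear(n_mels, spk_dim))
        self.decoder = nn.Conv1d(content_dim + spk_dim, n_mels, kernel_size=5, padding=2)

    def convert(self, source_mel, target_mel):
        content = self.content_enc(source_mel)                  # (B, C, T): what is said
        spk = self.speaker_enc(target_mel)                      # (B, S): who says it
        spk = spk[:, :, None].expand(-1, -1, content.size(-1))  # broadcast over time
        return self.decoder(torch.cat([content, spk], dim=1))   # (B, n_mels, T)

model = ToyVC()
src, tgt = torch.randn(1, 80, 100), torch.randn(1, 80, 120)
converted = model.convert(src, tgt)  # source content, target speaker identity
```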

Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives

1 code implementation • 12 Feb 2021 • Takuya Narihira, Javier Alonso Garcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedemann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama

While there exists a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, speedy computation in distributed settings, and compatibility between different tools.
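For orientation, a minimal usage sketch of the framework (distributed as the `nnabla` Python package) is given below; the tiny network, the random data, and the hyperparameters are arbitrary placeholders.

```python
# Minimal nnabla sketch: define a small classifier graph, run one training step.
import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S

# Graph definition (static computation graph on nn.Variable placeholders)
x = nn.Variable((8, 16))   # batch of 8 feature vectors
t = nn.Variable((8, 1))    # integer class labels
h = F.relu(PF.affine(x, 32, name="fc1"))
y = PF.affine(h, 10, name="fc2")
loss = F.mean(F.softmax_cross_entropy(y, t))

# Solver bound to the graph's parameters
solver = S.Adam(1e-3)
solver.set_parameters(nn.get_parameters())

# One training step on random placeholder data
x.d = np.random.randn(8, 16)
t.d = np.random.randint(0, 10, size=(8, 1))
loss.forward()
solver.zero_grad()
loss.backward()
solver.update()
print("loss:", float(loss.d))
```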

Efficient Sampling for Predictor-Based Neural Architecture Search

no code implementations • 24 Nov 2020 • Lukas Mauch, Stephen Tiedemann, Javier Alonso Garcia, Bac Nguyen Cong, Kazuki Yoshiyama, Fabien Cardinaux, Thomas Kemp

Usually, we compute the proxy for all DNNs in the network search space and pick those that maximize the proxy as candidates for optimization.

Neural Architecture Search
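The sketch below illustrates the generic predictor-based pipeline the abstract refers to: fit a cheap accuracy proxy on a few evaluated architectures, score every encoding in the search space, and keep the top candidates. The ridge-regression proxy and synthetic accuracies are stand-ins; this is not the sampling strategy proposed in the paper.

```python
# Generic predictor-based NAS ranking with a ridge-regression proxy.
# Synthetic data throughout; not the paper's sampling strategy.
import numpy as np

rng = np.random.default_rng(0)
space = rng.integers(0, 2, size=(1000, 20)).astype(float)  # toy binary architecture encodings

# pretend we trained and evaluated 30 architectures
train_idx = rng.choice(len(space), size=30, replace=False)
true_w = rng.normal(size=20)
train_acc = space[train_idx] @ true_w + 0.1 * rng.normal(size=30)  # synthetic accuracies

# fit the proxy (ridge regression)
X, y, lam = space[train_idx], train_acc, 1e-2
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# score the whole search space and pick the architectures maximizing the proxy
proxy = space @ w
candidates = np.argsort(proxy)[::-1][:10]
print("candidate architecture indices:", candidates)
```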

Unsupervised Cross-Domain Speech-to-Speech Conversion with Time-Frequency Consistency

no code implementations • 15 May 2020 • Mohammad Asif Khan, Fabien Cardinaux, Stefan Uhlich, Marc Ferras, Asja Fischer

This procedure suffers from the problem that the generated magnitude spectrogram may not be consistent; consistency is required to find a phase such that the full spectrogram corresponds to a natural-sounding speech waveform.

Generative Adversarial Network
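For reference, the conventional way to recover a phase consistent with a given magnitude spectrogram is Griffin-Lim, sketched below with librosa on a toy signal. This is the standard post-processing step the consistency issue above refers to, not the method proposed in the paper.

```python
# Conventional phase recovery: Griffin-Lim iteratively estimates a phase such
# that the full spectrogram corresponds to a real waveform. Toy 1 s tone as input.
import numpy as np
import librosa

sr = 16000
y = np.sin(2 * np.pi * 220 * np.arange(sr) / sr).astype(np.float32)

n_fft, hop = 1024, 256
magnitude = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))

# phase estimation from the magnitude alone
y_rec = librosa.griffinlim(magnitude, n_iter=60, hop_length=hop, win_length=n_fft)
```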

Iteratively Training Look-Up Tables for Network Quantization

no code implementations • 13 Nov 2018 • Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso García, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

In this paper we introduce a training method, called look-up table quantization (LUT-Q), which learns a dictionary and assigns each weight to one of the dictionary's values.

Object Detection • +1
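A heavily simplified, post-hoc sketch of the dictionary idea is given below: k-means over the weights alternates between assigning each weight to its nearest dictionary value and updating the values, whereas LUT-Q learns the assignments and dictionary during training. This is an illustration of codebook quantization, not the paper's iterative training procedure.

```python
# Simplified post-hoc dictionary (look-up table) quantization of one weight tensor
# via k-means; LUT-Q learns assignments and dictionary during training instead.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=4096)   # flattened weights of one layer
k = 16                            # dictionary size (e.g. 4-bit indices)

dictionary = np.quantile(weights, np.linspace(0.02, 0.98, k))  # initial values
for _ in range(20):
    # assignment step: nearest dictionary value per weight
    assign = np.argmin(np.abs(weights[:, None] - dictionary[None, :]), axis=1)
    # update step: each dictionary value becomes the mean of its assigned weights
    for j in range(k):
        if np.any(assign == j):
            dictionary[j] = weights[assign == j].mean()

quantized = dictionary[assign]    # each weight replaced by a dictionary value
print("unique values:", np.unique(quantized).size,
      "mean abs error:", np.abs(weights - quantized).mean())
```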
