Search Results for author: Kei Sawada

Found 10 papers, 0 papers with code

Release of Pre-Trained Models for the Japanese Language

no code implementations • 2 Apr 2024 • Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, Koh Mitsuda

AI democratization aims to create a world in which the average person can utilize AI techniques.

An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition

no code implementations • 6 Dec 2023 • Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, Kei Sawada

Advances in machine learning have made it possible to perform various text and speech processing tasks, including automatic speech recognition (ASR), in an end-to-end (E2E) manner.

Automatic Speech Recognition (ASR)

Towards human-like spoken dialogue generation between AI agents from written dialogue

no code implementations • 2 Oct 2023 • Kentaro Mitsui, Yukiya Hono, Kei Sawada

The advent of large language models (LLMs) has made it possible to generate natural written dialogues between two agents.

Dialogue Generation

Focused Prefix Tuning for Controllable Text Generation

no code implementations • 1 Jun 2023 • Congda Ma, Tianyu Zhao, Makoto Shing, Kei Sawada, Manabu Okumura

In a controllable text generation dataset, there exist unannotated attributes that could provide irrelevant learning signals to models that use it for training and thus degrade their performance.

Attribute, Text Generation

UniFLG: Unified Facial Landmark Generator from Text or Speech

no code implementations • 28 Feb 2023 • Kentaro Mitsui, Yukiya Hono, Kei Sawada

The two primary frameworks used for talking face generation comprise a text-driven framework, which generates synchronized speech and talking faces from text, and a speech-driven framework, which generates talking faces from speech.

Speech Synthesis, Talking Face Generation

Text-Guided Scene Sketch-to-Photo Synthesis

no code implementations • 14 Feb 2023 • AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura

To this end, we leverage knowledge from recent large-scale pre-trained generative models, resulting in text-guided sketch-to-photo synthesis without the need for reference images.

Self-Supervised Learning

End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue

no code implementations • 24 Jun 2022 • Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, Keiichi Tokuda

A style encoder that extracts a latent speaking style representation from speech is trained jointly with TTS.

Variational Inference

MSR-NV: Neural Vocoder Using Multiple Sampling Rates

no code implementations • 28 Sep 2021 • Kentaro Mitsui, Kei Sawada

In this study, we propose a method to handle multiple sampling rates in a single NV, called the MSR-NV.

Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis

no code implementations • 17 Sep 2020 • Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

This framework consists of a multi-grained variational autoencoder, a conditional prior, and a multi-level auto-regressive latent converter to obtain the different time-resolution latent variables and sample the finer-level latent variables from the coarser-level ones by taking into account the input text.

Expressive Speech Synthesis, Text-To-Speech Synthesis

Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning

no code implementations • ICLR 2021 • Ruozi Huang, Huang Hu, Wei Wu, Kei Sawada, Mi Zhang, Daxin Jiang

In this paper, we formalize the music-conditioned dance generation as a sequence-to-sequence learning problem and devise a novel seq2seq architecture to efficiently process long sequences of music features and capture the fine-grained correspondence between music and dance.

Ranked #1 on Motion Synthesis on BRACE

Motion Synthesis, Pose Estimation
