Mixture of Logistic Distributions

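Mixture of Logistic Distributions (MoL) is a type of output function that serves as an alternative to a softmax layer. Rather than emitting one probability per possible value, the network predicts the mixture weights, means, and scales of a small number of logistic distributions. A discretized logistic mixture likelihood is used in PixelCNN++ and in later WaveNet variants such as Parallel WaveNet to predict discrete values.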
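As a rough illustration of the discretized likelihood described above, the following pure-Python sketch assigns a probability to each integer value by taking the difference of the mixture's logistic CDF at the two edges of that value's bin, with the first and last bins absorbing the tails. The function names, the 256-bin integer scale, and the example parameters are assumptions made for illustration; this is not the PixelCNN++ or WaveNet reference code, which works on rescaled values and in log space.

```python
import math


def sigmoid(z):
    # Logistic CDF, written so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def discretized_mol_prob(x, weights, means, scales, num_bins=256):
    """Probability of integer x in [0, num_bins - 1] under a logistic mixture.

    Each bin covers [x - 0.5, x + 0.5]; the edge bins extend to -inf / +inf
    so that the probabilities over all bins sum to one.
    """
    total = 0.0
    for w, mu, s in zip(weights, means, scales):
        upper = 1.0 if x == num_bins - 1 else sigmoid((x + 0.5 - mu) / s)
        lower = 0.0 if x == 0 else sigmoid((x - 0.5 - mu) / s)
        total += w * (upper - lower)
    return total


# Hypothetical two-component mixture (weights must sum to 1, scales > 0).
weights = [0.6, 0.4]
means = [64.0, 180.0]
scales = [8.0, 15.0]

probs = [discretized_mol_prob(x, weights, means, scales) for x in range(256)]
print(sum(probs))  # should be very close to 1.0
```

With only three parameters per component, the output layer stays small even when the number of discrete values is large (for example 16-bit audio), which is the practical motivation for preferring it over a softmax with one output per value.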


Latest Papers

PAPER DATE
Learning Speaker Embedding from Text-to-Speech
Jaejin Cho, Piotr Zelasko, Jesus Villalba, Shinji Watanabe, Najim Dehak
2020-10-21
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang, Patrick Lumban Tobing, Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Toda
2020-10-09
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu
2020-10-08
DiffWave: A Versatile Diffusion Model for Audio Synthesis
Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro
2020-09-21
Applications of BERT Based Sequence Tagging Models on Chinese Medical Text Attributes Extraction
Gang Zhao, Teng Zhang, Chenxiao Wang, Ping Lv, Ji Wu
2020-08-22
SpeedySpeech: Efficient Neural Speech Synthesis
Jan Vainer, Ondřej Dušek
2020-08-09
Learning from a Complementary-label Source Domain: Theory and Algorithms
Yiyang Zhang, Feng Liu, Zhen Fang, Bo Yuan, Guangquan Zhang, Jie Lu
2020-08-04
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
Tomáš Nekvinda, Ondřej Dušek
2020-08-03
Clarinet: A One-step Approach Towards Budget-friendly Unsupervised Domain Adaptation
Yiyang Zhang, Feng Liu, Zhen Fang, Bo Yuan, Guangquan Zhang, Jie Lu
2020-07-29
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network
Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda
2020-07-11
Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
Narjes Bozorg, Michael T. Johnson
2020-06-22
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Jiaqi Su, Zeyu Jin, Adam Finkelstein
2020-06-10
A non-causal FFTNet architecture for speech enhancement
Muhammed PV Shifas, Nagaraj Adiga, Vassilis Tsiaras, Yannis Stylianou
2020-06-08
CLARINET: A RISC-V Based Framework for Posit Arithmetic Empiricism
Riya Jain, Niraj Sharma, Farhad Merchant, Sachin Patkar, Rainer Leupers
2020-05-30
NAUTILUS: a Versatile Voice Cloning System
Hieu-Thi Luong, Junichi Yamagishi
2020-05-22
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle, Kevin Shih, Ryan Prenger, Bryan Catanzaro
2020-05-12
Physics-inspired deep learning to characterize the signal manifold of quasi-circular, spinning, non-precessing binary black hole mergers
Asad Khan, E. A. Huerta, Arnav Das
2020-04-20
FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA
Shehzeen Hussain, Mojan Javaheripi, Paarth Neekhara, Ryan Kastner, Farinaz Koushanfar
2020-02-09
Closing the Dequantization Gap: PixelCNN as a Single-Layer Flow
Didrik Nielsen, Ole Winther
2020-02-06
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu
2020-02-06
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss
Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li
2020-02-02
Parallel Neural Text-to-Speech
Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao
2020-01-01
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
2019-12-23
Incrementally Improving Graph WaveNet Performance on Traffic Prediction
Sam Shleifer, Clara McCreery, Vamsi Chitters
2019-12-11
Towards Robust Neural Vocoding for Speech Generation: A Survey
Po-chun Hsu, Chun-hsuan Wang, Andy T. Liu, Hung-yi Lee
2019-12-05
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
2019-12-03
High-quality Speech Synthesis Using Super-resolution Mel-Spectrogram
Leyuan Sheng, Dong-Yan Huang, Evgeniy N. Pavlovskiy
2019-12-03
Cross-lingual Multi-speaker Text-to-speech Synthesis for Voice Cloning without Using Parallel Corpus for Unseen Speakers
Zhaoyu Liu, Brian Mak
2019-11-26
Speaker independence of neural vocoders and their effect on parametric resynthesis speech enhancement
Soumi Maiti, Michael I Mandel
2019-11-14
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling
Daniel Stoller, Mi Tian, Sebastian Ewert, Simon Dixon
2019-11-14
Transferring neural speech waveform synthesizers to musical instrument sounds generation
Yi Zhao, Xin Wang, Lauri Juvela, Junichi Yamagishi
2019-10-27
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens
Rafael Valle, Jason Li, Ryan Prenger, Bryan Catanzaro
2019-10-26
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
2019-10-25
Label-efficient audio classification through multitask learning and self-supervision
Tyler Lee, Ting Gong, Suchismita Padhy, Andrew Rouditchenko, Anthony Ndirango
2019-10-19
Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder
Cristina Gârbacea, Aäron van den Oord, Yazhe Li, Felicia S C Lim, Alejandro Luebs, Oriol Vinyals, Thomas C Walters
2019-10-14
GDP: Generalized Device Placement for Dataflow Graphs
Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter C. Ma, Qiumin Xu, Ming Zhong, Hanxiao Liu, Anna Goldie, Azalia Mirhoseini, James Laudon
2019-09-28
High Fidelity Speech Synthesis with Adversarial Networks
Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan
2019-09-25
Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples
Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin
2019-09-03
Hierarchical Sequence to Sequence Voice Conversion with Limited Data
Praveen Narayanan, Punarjay Chakravarty, Francois Charette, Gint Puskorius
2019-07-15
Multi-Speaker End-to-End Speech Synthesis
Jihyun Park, Kexin Zhao, Kainan Peng, Wei Ping
2019-07-09
Towards Debugging Deep Neural Networks by Generating Speech Utterances
Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki
2019-07-06
Deep Learning Based Energy Disaggregation and On/Off Detection of Household Appliances
Jie Jiang, Qiuqiang Kong, Mark Plumbley, Nigel Gilbert
2019-07-03
Analysis by Adversarial Synthesis -- A Novel Approach for Speech Vocoding
Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas Maier
2019-07-01
Parametric Resynthesis with neural vocoders
Soumi Maiti, Michael I Mandel
2019-06-16
Dilated Convolution with Dilated GRU for Music Source Separation
Jen-Yu Liu, Yi-Hsuan Yang
2019-06-04
Graph WaveNet for Deep Spatial-Temporal Graph Modeling
Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Chengqi Zhang
2019-05-31
FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
2019-05-22
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems
Ohsung Kwon, Eunwoo Song, Jae-Min Kim, Hong-Goo Kang
2019-05-21
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
Vincent Wan, Chun-an Chan, Tom Kenter, Jakub Vit, Rob Clark
2019-05-17
Universal Adversarial Perturbations for Speech Recognition Systems
Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar
2019-05-09
Autoencoder-based Music Translation
Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman
2019-05-01
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang, Shinji Takaki, Junichi Yamagishi
2019-04-27
Unsupervised Singing Voice Conversion
Eliya Nachmani, Lior Wolf
2019-04-13
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
2019-04-09
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku
2019-04-08
Taco-VC: A Single Speaker Tacotron based Voice Conversion with Limited Data
Roee Levy Leshem, Raja Giryes
2019-04-06
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
2019-04-05
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet
Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi
2019-03-29
A Unified Neural Architecture for Instrumental Audio Tasks
Steven Spratley, Daniel Beck, Trevor Cohn
2019-03-01
GANSynth: Adversarial Neural Audio Synthesis
Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts
2019-02-23
Wavenilm: A causal neural network for power disaggregation from the complex power signal
Alon Harell, Stephen Makonin, Ivan V. Bajić
2019-02-23
Unsupervised speech representation learning using WaveNet autoencoders
Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord
2019-01-25
Adversarial Signal Denoising with Encoder-Decoder Networks
Leslie Casas, Attila Klimmek, Nassir Navab, Vasileios Belagiannis
2018-12-20
Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder
Qiao Tian, Bing Yang, Jing Chen, Benlai Tang, Shan Liu
2018-12-06
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang
2018-11-27
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Sicong Huang, Qiyang Li, Cem Anil, Xuchan Bao, Sageev Oore, Roger B. Grosse
2018-11-22
Efficient keyword spotting using dilated convolutions and gating
Alice Coucke, Mohammed Chlieh, Thibault Gisselbrecht, David Leroy, Mathieu Poumeyrol, Thibaut Lavril
2018-11-19
ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems
Eunwoo Song, Kyungguen Byun, Hong-Goo Kang
2018-11-09
Speaker-adaptive neural vocoders for parametric speech synthesis systems
Eunwoo Song, Jin-Seob Kim, Kyungguen Byun, Hong-Goo Kang
2018-11-08
FloWaveNet : A Generative Flow for Raw Audio
Sungwon Kim, Sang-gil Lee, Jongyoon Song, Sungroh Yoon
2018-11-06
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach
Ran Wang, Yao Wang, Adeen Flinker
2018-11-06
WaveGlow: A Flow-based Generative Network for Speech Synthesis
Ryan Prenger, Rafael Valle, Bryan Catanzaro
2018-10-31
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks
Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku
2018-10-30
Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention
Bajibabu Bollepalli, Lauri Juvela, Paavo Alku
2018-10-29
Neural source-filter-based waveform model for statistical parametric speech synthesis
Xin Wang, Shinji Takaki, Junichi Yamagishi
2018-10-29
SING: Symbol-to-Instrument Neural Generator
Alexandre Défossez, Neil Zeghidour, Nicolas Usunier, Léon Bottou, Francis Bach
2018-10-23
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
Azam Rabiee, Soo-Young Lee
2018-10-12
Sample Efficient Adaptive Text-to-Speech
Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas
2018-09-27
Neural Speech Synthesis with Transformer Network
Naihan Li, Shujie Liu, Yanqing Liu, Sheng Zhao, Ming Liu, Ming Zhou
2018-09-19
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu
2018-07-31
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
Wei Ping, Kainan Peng, Jitong Chen
2018-07-19
Conditioning Deep Generative Raw Audio Models for Structured Automatic Music
Rachel Manzelli, Vijay Thakkar, Ali Siahkamari, Brian Kulis
2018-06-26
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang
2018-06-15
A Universal Music Translation Network
Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman
2018-05-21
Speaker-independent raw waveform model for glottal excitation
Lauri Juvela, Vassilis Tsiaras, Bajibabu Bollepalli, Manu Airaksinen, Junichi Yamagishi, Paavo Alku
2018-04-25
A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis
Xin Wang, Jaime Lorenzo-Trueba, Shinji Takaki, Lauri Juvela, Junichi Yamagishi
2018-04-07
Fast Decoding in Sequence Models using Discrete Latent Variables
Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer
2018-03-09
Do WaveNets Dream of Acoustic Waves?
Kanru Hua
2018-02-23
Attacking Speaker Recognition With Deep Generative Models
Wilson Cai, Anish Doshi, Rafael Valle
2018-01-08
Dilated Convolutional Neural Networks for Time Series Forecasting
Anastasia Borovykh, Sander Bohte, Cornelis W. Oosterlee
2018-01-02
HybridNet: A Hybrid Neural Architecture to Speed-up Autoregressive Models
Yanqi Zhou, Wei Ping, Sercan Arik, Kainan Peng, Greg Diamos
2018-01-01
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu
2017-12-16
Wavenet based low rate speech coding
W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters
2017-12-01
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
2017-11-28
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
Wei Ping, Kainan Peng, Andrew Gibiansky, Sercan O. Arik, Ajay Kannan, Sharan Narang, Jonathan Raiman, John Miller
2017-10-20
Perceptual audio loss function for deep learning
Dan Elbaz, Michael Zibulevsky
2017-08-20
Fast Generation for Convolutional Autoregressive Models
Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
2017-04-20
A Neural Parametric Singing Synthesizer
Merlijn Blaauw, Jordi Bonada
2017-04-12
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, Mohammad Norouzi
2017-04-05
MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation
Li-Chia Yang, Szu-Yu Chou, Yi-Hsuan Yang
2017-03-31
Conditional Time Series Forecasting with Convolutional Neural Networks
Anastasia Borovykh, Sander Bohte, Cornelis W. Oosterlee
2017-03-14
Deep Voice: Real-time Neural Text-to-Speech
Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xian Li, John Miller, Andrew Ng, Jonathan Raiman, Shubho Sengupta, Mohammad Shoeybi
2017-02-25
Fast Wavenet Generation Algorithm
Tom Le Paine, Pooya Khorrami, Shiyu Chang, Yang Zhang, Prajit Ramachandran, Mark A. Hasegawa-Johnson, Thomas S. Huang
2016-11-29
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
2016-09-12
