Search Results for author: Yuntian Deng

Found 31 papers, 15 papers with code

Sequence-to-Lattice Models for Fast Translation

no code implementations Findings (EMNLP) 2021 Yuntian Deng, Alexander Rush

Non-autoregressive machine translation (NAT) approaches enable fast generation by utilizing parallelizable generative processes.

Machine Translation · Translation

Implicit Chain of Thought Reasoning via Knowledge Distillation

1 code implementation 2 Nov 2023 Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber

In this work, we explore an alternative reasoning approach: instead of explicitly producing the chain of thought reasoning steps, we use the language model's internal hidden states to perform implicit reasoning.
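
A minimal toy sketch of this idea, under the assumption that the key ingredients are (1) a teacher that exposes hidden states while processing the explicit chain of thought and (2) an emulator distilled to predict those states directly from the input; the module names and shapes below are hypothetical, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 64
teacher = nn.GRU(d, d, batch_first=True)        # stand-in for a teacher LM run over CoT tokens
emulator = nn.Sequential(nn.Linear(d, d), nn.Tanh(), nn.Linear(d, d))
answer_head = nn.Linear(d, 10)

x = torch.randn(8, d)                            # encoded problem statements (toy batch)
cot = torch.randn(8, 5, d)                       # embedded chain-of-thought steps (toy)

with torch.no_grad():
    _, teacher_state = teacher(cot)              # teacher's final hidden state over the CoT
teacher_state = teacher_state.squeeze(0)         # (batch, d)

pred_state = emulator(x)                         # predict the teacher's state from the input alone
distill_loss = F.mse_loss(pred_state, teacher_state)
answer_loss = F.cross_entropy(answer_head(pred_state), torch.randint(0, 10, (8,)))
(distill_loss + answer_loss).backward()          # train emulator and answer head jointly
```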

Knowledge Distillation · Math

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

no code implementations 6 Oct 2023 Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens

In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.

Model Criticism for Long-Form Text Generation

1 code implementation 16 Oct 2022 Yuntian Deng, Volodymyr Kuleshov, Alexander M. Rush

Language models have demonstrated the ability to generate highly fluent text; however, it remains unclear whether their output retains coherent high-level structure (e.g., story progression).

Text Generation

Markup-to-Image Diffusion Models with Scheduled Sampling

1 code implementation 11 Oct 2022 Yuntian Deng, Noriyuki Kojima, Alexander M. Rush

These experiments each verify the effectiveness of the diffusion process and the use of scheduled sampling to fix generation issues.
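
A rough sketch of what scheduled sampling can look like in a DDPM-style training loop (illustrative only; the x0-prediction convention, step offset, and mixing probability below are assumptions, not the paper's exact procedure):

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

net = torch.nn.Linear(16, 16)
model = lambda x_t, t: net(x_t)                  # toy denoiser that predicts x0, ignoring t

def q_sample(x0, t, noise):
    a = alphas_bar[t].sqrt().view(-1, 1)
    s = (1.0 - alphas_bar[t]).sqrt().view(-1, 1)
    return a * x0 + s * noise

def train_step(x0, p_sched=0.5):
    b = x0.size(0)
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    if torch.rand(()) < p_sched:
        # build x_t from the model's own x0 prediction at a later (noisier) step,
        # instead of from the ground truth, to reduce train/test mismatch
        t_later = torch.clamp(t + 50, max=T - 1)
        x_later = q_sample(x0, t_later, torch.randn_like(x0))
        x0_hat = model(x_later, t_later).detach()
        x_t = q_sample(x0_hat, t, noise)
    else:
        x_t = q_sample(x0, t, noise)
    return ((model(x_t, t) - x0) ** 2).mean()

loss = train_step(torch.randn(4, 16))
loss.backward()
```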

Denoising · Image Generation +1

Semi-Parametric Inducing Point Networks and Neural Processes

2 code implementations 24 May 2022 Richa Rastogi, Yair Schiff, Alon Hacohen, Zhaozhi Li, Ian Lee, Yuntian Deng, Mert R. Sabuncu, Volodymyr Kuleshov

We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner.
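
A minimal sketch of the inducing-point intuition, assuming the core mechanism is two cross-attention steps (learned inducing points summarize the training set once; queries then attend only to that summary); the sizes and module choices here are hypothetical, not SPIN's exact architecture:

```python
import torch
import torch.nn as nn

d, n_inducing = 64, 16
inducing = nn.Parameter(torch.randn(1, n_inducing, d))          # learned inducing points
attn_data = nn.MultiheadAttention(d, 4, batch_first=True)       # inducing points <- training set
attn_query = nn.MultiheadAttention(d, 4, batch_first=True)      # queries <- inducing points

train_set = torch.randn(1, 5000, d)     # encoded training examples (toy)
queries = torch.randn(1, 32, d)         # encoded test-time queries (toy)

summary, _ = attn_data(inducing, train_set, train_set)   # cost ~ n_inducing * n_train, amortized
out, _ = attn_query(queries, summary, summary)           # per-query cost ~ n_inducing, not n_train
print(out.shape)                                         # torch.Size([1, 32, 64])
```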

Imputation · Meta-Learning

Low-Rank Constraints for Fast Inference in Structured Models

1 code implementation NeurIPS 2021 Justin T. Chiu, Yuntian Deng, Alexander M. Rush

This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
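
One concrete instance of the complexity reduction, assuming (for illustration) a structured model whose transition matrix admits a low-rank factorization: a forward-algorithm step becomes two thin matrix products instead of one dense one.

```python
import numpy as np

N, r = 4096, 32
U, V = np.random.rand(N, r), np.random.rand(N, r)
A = U @ V.T                               # dense transition matrix, materialized only to verify
alpha = np.random.rand(N)                 # forward messages at one timestep

direct = alpha @ A                        # O(N^2) per step
low_rank = (alpha @ U) @ V.T              # O(N r) per step
print(np.allclose(direct, low_rank))      # True
```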

Language Modelling · Music Modeling

Rationales for Sequential Predictions

2 code implementations EMNLP 2021 Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush

Compared to existing baselines, greedy rationalization is best at optimizing the combinatorial objective and provides the most faithful rationales.
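
A toy sketch of greedy rationalization (illustrative only; the actual method relies on a model fine-tuned to behave sensibly on incomplete contexts, which the dummy linear "model" below does not capture):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 3))                      # toy "model": logits = sum of token weight rows

def predict(token_ids):
    logits = W[list(token_ids)].sum(0) if token_ids else np.zeros(3)
    return int(logits.argmax()), logits

context = [0, 1, 2, 3, 4, 5]
target, _ = predict(context)                     # prediction on the full context

rationale = set()
while predict(sorted(rationale))[0] != target:
    # greedily add the context token that most supports the target prediction
    best = max(set(context) - rationale,
               key=lambda tok: predict(sorted(rationale | {tok}))[1][target])
    rationale.add(best)
print("rationale:", sorted(rationale))
```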

Combinatorial Optimization · Language Modelling +2

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations 6 Jul 2021 Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
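
A small sketch of a weighted-GP UCB step, under the simplifying assumption that recency weights enter by inflating the noise attached to older observations (the paper's exact weighting scheme and confidence width may differ):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

X = np.array([0.1, 0.4, 0.7])            # previously played arms
y = np.array([0.2, 0.9, 0.5])            # observed rewards
w = np.array([0.5, 0.8, 1.0])            # recency weights (older observations -> smaller weight)
sigma2, beta = 0.01, 2.0

K = rbf(X, X) + np.diag(sigma2 / w)      # weighted noise: stale data counts for less
K_inv = np.linalg.inv(K)

grid = np.linspace(0.0, 1.0, 101)        # candidate arms
k_star = rbf(grid, X)
mean = k_star @ K_inv @ y
var = 1.0 - np.einsum('ij,jk,ik->i', k_star, K_inv, k_star)
ucb = mean + beta * np.sqrt(np.maximum(var, 0.0))
print("next arm:", grid[ucb.argmax()])
```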

regression

Incentive Design and Profit Sharing in Multi-modal Transportation Network

no code implementations 9 Jan 2021 Yuntian Deng, Shiping Shao, Archak Mittal, Richard Twumasi-Boakye, James Fishelson, Abhishek Gupta, Ness B. Shroff

Accordingly, in this paper, we use cooperative game theory coupled with the hyperpath-based stochastic user equilibrium framework to study such a market.
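
For intuition only, here is one standard cooperative-game profit-sharing rule, the Shapley value, computed on a made-up three-operator coalition game; the operators and payoffs are hypothetical, and the paper's actual allocation mechanism within the hyperpath-based equilibrium framework may differ.

```python
from itertools import combinations
from math import factorial

players = ["transit", "ride_hail", "micro_mobility"]
value = {                                  # hypothetical coalition profits
    (): 0, ("transit",): 4, ("ride_hail",): 3, ("micro_mobility",): 1,
    ("transit", "ride_hail"): 9, ("transit", "micro_mobility"): 6,
    ("ride_hail", "micro_mobility"): 5, ("transit", "ride_hail", "micro_mobility"): 12,
}

def shapley(i):
    n, total = len(players), 0.0
    others = [p for p in players if p != i]
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            S_key = tuple(sorted(S, key=players.index))
            Si_key = tuple(sorted(S + (i,), key=players.index))
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (value[Si_key] - value[S_key])
    return total

for p in players:                          # allocations sum to the grand-coalition profit
    print(p, round(shapley(p), 3))
```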

Cascaded Text Generation with Markov Transformers

1 code implementation NeurIPS 2020 Yuntian Deng, Alexander M. Rush

The two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies.

Machine Translation · Text Generation +1

Residual Energy-Based Models for Text Generation

1 code implementation ICLR 2020 Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

In this work, we investigate un-normalized energy-based models (EBMs) which operate not at the token but at the sequence level.
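
The residual model reweights a base language model by a sequence-level energy, p(x) ∝ p_LM(x) · exp(-E(x)); below is a toy sketch of generation by self-normalized importance resampling over LM samples (the energy function and candidate set here are placeholders, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
candidates = [rng.integers(0, 50, size=12) for _ in range(64)]   # stand-ins for LM samples

def energy(seq):
    return 0.1 * np.unique(seq).size        # placeholder sequence-level energy

log_w = np.array([-energy(s) for s in candidates])               # importance log-weights
w = np.exp(log_w - log_w.max())
w /= w.sum()
chosen = candidates[rng.choice(len(candidates), p=w)]            # resample by energy
print(chosen)
```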

Language Modelling · Machine Translation +2

Residual Energy-Based Models for Text

no code implementations 6 Apr 2020 Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam

Current large-scale auto-regressive language models display impressive fluency and can generate convincing text.

AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference

no code implementations 29 Sep 2019 Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander Rush, David Brooks, Gu-Yeon Wei

Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models.
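
A rough sketch of the underlying idea, under the assumption that the key move is choosing the float format's exponent range per tensor so it covers the tensor's actual dynamic range; the bit allocation and rounding details below are simplifications, not the published format:

```python
import numpy as np

def adaptive_float_quantize(w, n_bits=8, n_exp=3):
    n_man = n_bits - 1 - n_exp                       # 1 sign bit, rest split exponent/mantissa
    e_hi = int(np.floor(np.log2(np.abs(w).max() + 1e-12)))   # align top exponent to max |w|
    e_lo = e_hi - (2 ** n_exp - 1)                   # smallest exponent the format keeps
    sign = np.sign(w)
    mag = np.abs(w) + 1e-12
    e = np.clip(np.floor(np.log2(mag)), e_lo, e_hi)
    frac = np.clip(np.round((mag / 2.0 ** e - 1.0) * 2 ** n_man) / 2 ** n_man + 1.0,
                   1.0, 2.0 - 2.0 ** -n_man)         # quantized mantissa in [1, 2)
    q = sign * frac * 2.0 ** e
    q[np.abs(w) < 2.0 ** e_lo / 2] = 0.0             # flush values below the format's range
    return q

x = np.random.randn(1000) * 0.05
print("mean abs error:", np.abs(x - adaptive_float_quantize(x)).mean())
```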

Quantization

Neural Linguistic Steganography

1 code implementation IJCNLP 2019 Zachary M. Ziegler, Yuntian Deng, Alexander M. Rush

Whereas traditional cryptography encrypts a secret message into an unintelligible form, steganography conceals that communication is taking place by encoding a secret message into a cover signal.
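
A heavily simplified, rank-based toy version of the idea (not the paper's arithmetic-coding construction): sender and receiver share the same language model; each group of secret bits picks which of the model's top-ranked next tokens to emit, and the receiver recovers the bits from the emitted token's rank. The "model" below is a deterministic stand-in.

```python
import numpy as np

V, b = 256, 2                                        # vocab size, secret bits per cover token

def next_token_ranking(prefix):                      # stand-in for a shared language model
    local = np.random.default_rng(hash(tuple(prefix)) % (2 ** 32))
    return local.permutation(V)                      # tokens ordered by "probability"

def encode(bits):
    prefix, cover = [0], []
    for i in range(0, len(bits), b):
        rank = int("".join(map(str, bits[i:i + b])), 2)
        tok = int(next_token_ranking(prefix)[rank])
        cover.append(tok); prefix.append(tok)
    return cover

def decode(cover):
    prefix, bits = [0], []
    for tok in cover:
        rank = int(np.where(next_token_ranking(prefix) == tok)[0][0])
        bits += [int(c) for c in format(rank, f"0{b}b")]
        prefix.append(tok)
    return bits

secret = [int(x) for x in np.random.default_rng(1).integers(0, 2, size=16)]
cover_text = encode(secret)
assert decode(cover_text) == secret
print(cover_text)
```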

Language Modelling · Linguistic steganography

Latent Alignment and Variational Attention

1 code implementation NeurIPS 2018 Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, Alexander M. Rush

This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.

Hard Attention · Machine Translation +4

OpenNMT: Open-source Toolkit for Neural Machine Translation

no code implementations 12 Sep 2017 Guillaume Klein, Yoon Kim, Yuntian Deng, Josep Crego, Jean Senellart, Alexander M. Rush

We introduce an open-source toolkit for neural machine translation (NMT) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements.

Machine Translation · NMT +1

Learning Latent Space Models with Angular Constraints

no code implementations ICML 2017 Pengtao Xie, Yuntian Deng, Yi Zhou, Abhimanu Kumar, Yao-Liang Yu, James Zou, Eric P. Xing

The large model capacity of latent space models (LSMs) enables them to achieve great performance on various applications, but it also renders them prone to overfitting.

Dropout with Expectation-linear Regularization

no code implementations 26 Sep 2016 Xuezhe Ma, Yingkai Gao, Zhiting Hu, Yao-Liang Yu, Yuntian Deng, Eduard Hovy

Algorithmically, we show that our proposed measure of the inference gap can be used to regularize the standard dropout training objective, resulting in an explicit control of the gap.
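
A toy sketch of the general mechanism, under the assumption that the regularizer can be approximated as the distance between a deterministic forward pass and a Monte-Carlo average over dropout masks (the paper derives a specific estimator, which may differ from this):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 10))

def inference_gap(x, n_samples=4):
    model.eval()
    with torch.no_grad():
        deterministic = model(x)                     # standard (scaled) dropout inference
    model.train()
    mc = torch.stack([model(x) for _ in range(n_samples)]).mean(0)   # expected dropout output
    return F.mse_loss(mc, deterministic)

x, y = torch.randn(32, 20), torch.randint(0, 10, (32,))
model.train()
loss = F.cross_entropy(model(x), y) + 1.0 * inference_gap(x)         # lambda = 1.0 (arbitrary)
loss.backward()
```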

Image Classification

Image-to-Markup Generation with Coarse-to-Fine Attention

14 code implementations ICML 2017 Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush

We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.

Optical Character Recognition (OCR)

Neural Machine Translation with Recurrent Attention Modeling

no code implementations EACL 2017 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation · Translation

Latent Variable Modeling with Diversity-Inducing Mutual Angular Regularization

no code implementations 23 Dec 2015 Pengtao Xie, Yuntian Deng, Eric Xing

On two popular latent variable models (restricted Boltzmann machine and distance metric learning), we demonstrate that MAR can effectively capture long-tail patterns, reduce model complexity without sacrificing expressivity and improve interpretability.
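
For intuition, a small sketch of a diversity regularizer in the same spirit (an illustrative variant that penalizes pairwise cosine similarity so component vectors spread apart; the paper's mutual angular regularizer is defined directly on angles and may differ):

```python
import torch
import torch.nn.functional as F

components = torch.nn.Parameter(torch.randn(16, 64))     # e.g. RBM weight vectors or metric bases

def diversity_penalty(W):
    Wn = F.normalize(W, dim=1)
    cos = Wn @ Wn.t()
    off_diag = cos - torch.eye(W.size(0))                 # ignore self-similarity
    return (off_diag ** 2).sum() / (W.size(0) * (W.size(0) - 1))

loss = diversity_penalty(components)                      # added to the model's main objective
loss.backward()
```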

Metric Learning

On the Generalization Error Bounds of Neural Networks under Diversity-Inducing Mutual Angular Regularization

no code implementations 23 Nov 2015 Pengtao Xie, Yuntian Deng, Eric Xing

Recently diversity-inducing regularization methods for latent variable models (LVMs), which encourage the components in LVMs to be diverse, have been studied to address several issues involved in latent variable modeling: (1) how to capture long-tail patterns underlying data; (2) how to reduce model complexity without sacrificing expressivity; (3) how to improve the interpretability of learned patterns.

Creating Scalable and Interactive Web Applications Using High Performance Latent Variable Models

no code implementations 21 Oct 2015 Aaron Q. Li, Yuntian Deng, Kublai Jing, Joseph W Robinson

In this project we outline a modularized, scalable system for comparing Amazon products in an interactive and informative way using efficient latent variable models and dynamic visualization.
