Search Results for author: Jiatao Gu

Found 74 papers, 31 papers with code

Non-autoregressive Translation with Disentangled Context Transformer

1 code implementation ICML 2020 Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.

Machine Translation Sentence +1

Facebook AI’s WMT20 News Translation Task Submission

no code implementations WMT (EMNLP) 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Non-Autoregressive Sequence Generation

no code implementations ACL 2022 Jiatao Gu, Xu Tan

Non-autoregressive sequence generation (NAR) attempts to generate entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation.
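
The contrast with left-to-right decoding is easy to see in code. Below is a minimal, hedged sketch using a toy PyTorch module (the `ToyDecoder` class, vocabulary size, and sequence length are illustrative assumptions, not the paper's model): the autoregressive loop needs one forward pass per emitted token, while the non-autoregressive variant predicts every position in a single pass.

```python
import torch
import torch.nn as nn

VOCAB, HIDDEN, MAX_LEN = 1000, 64, 8

class ToyDecoder(nn.Module):
    """Hypothetical stand-in for a sequence decoder; not any paper's architecture."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.proj = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens):                   # tokens: (batch, length)
        return self.proj(self.embed(tokens))     # logits: (batch, length, vocab)

model = ToyDecoder()

# Autoregressive-style loop: one forward pass per emitted token
# (a real decoder would condition each step on the full prefix).
prefix = torch.zeros(1, 1, dtype=torch.long)     # start token
for _ in range(MAX_LEN):
    next_tok = model(prefix)[:, -1].argmax(-1, keepdim=True)
    prefix = torch.cat([prefix, next_tok], dim=1)

# Non-autoregressive: a single forward pass over placeholder positions
# predicts all output tokens at once.
placeholders = torch.zeros(1, MAX_LEN, dtype=torch.long)
parallel_out = model(placeholders).argmax(-1)    # shape (1, MAX_LEN)
print(prefix.shape, parallel_out.shape)
```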

How Far Are We from Intelligent Visual Deductive Reasoning?

1 code implementation 7 Mar 2024 Yizhe Zhang, He Bai, Ruixiang Zhang, Jiatao Gu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly

Vision-Language Models (VLMs) such as GPT-4V have recently made incredible strides on diverse vision-language tasks.

In-Context Learning Visual Reasoning

Divide-or-Conquer? Which Part Should You Distill Your LLM?

no code implementations 22 Feb 2024 Zhuofeng Wu, He Bai, Aonan Zhang, Jiatao Gu, VG Vinod Vydiswaran, Navdeep Jaitly, Yizhe Zhang

Recent methods have demonstrated that Large Language Models (LLMs) can solve reasoning tasks better when they are encouraged to solve subtasks of the main task first.

Problem Decomposition

Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models

no code implementations 13 Dec 2023 Liangchen Song, Liangliang Cao, Jiatao Gu, Yifan Jiang, Junsong Yuan, Hao Tang

In this work, we propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated.

Diffusion Models Without Attention

no code implementations 30 Nov 2023 Jing Nathan Yan, Jiatao Gu, Alexander M. Rush

In recent advancements in high-fidelity image generation, Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a key player.

Denoising Image Generation

Matryoshka Diffusion Models

no code implementations 23 Oct 2023 Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, Navdeep Jaitly

Diffusion models are the de facto approach for generating high-quality images and videos, but learning high-dimensional models remains a formidable task due to computational and optimization challenges.

Image Generation Zero-shot Generalization

Adaptivity and Modularity for Efficient Generalization Over Task Complexity

no code implementations 13 Oct 2023 Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, Chen Huang, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio

We investigate how the use of a mechanism for adaptive and modular computation in transformers facilitates the learning of tasks that demand generalization over the number of sequential computation steps (i.e., the depth of the computation graph).

Retrieval

Generative Modeling with Phase Stochastic Bridges

no code implementations 11 Oct 2023 Tianrong Chen, Jiatao Gu, Laurent Dinh, Evangelos A. Theodorou, Josh Susskind, Shuangfei Zhai

In this work, we introduce a novel generative modeling framework grounded in phase space dynamics, where a phase space is defined as an augmented space encompassing both position and velocity.

Image Generation Position

PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model

1 code implementation NeurIPS 2023 Yizhe Zhang, Jiatao Gu, Zhuofeng Wu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly

Autoregressive models for text sometimes generate repetitive and low-quality output because errors accumulate during the steps of generation.

Denoising

Control3Diff: Learning Controllable 3D Diffusion Models from Single-view Images

no code implementations 13 Apr 2023 Jiatao Gu, Qingzhe Gao, Shuangfei Zhai, Baoquan Chen, Lingjie Liu, Josh Susskind

To address these challenges, we present Control3Diff, a 3D diffusion model that combines the strengths of diffusion models and 3D GANs for versatile, controllable 3D-aware image synthesis for single-view datasets.

3D-Aware Image Synthesis

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

1 code implementation 11 Mar 2023 Shuangfei Zhai, Tatiana Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, Yizhe Zhang, Jiatao Gu, Josh Susskind

We show that σReparam provides stability and robustness with respect to the choice of hyperparameters, going so far as to enable training (a) a Vision Transformer to competitive performance without warmup, weight decay, layer normalization, or adaptive optimizers; (b) deep architectures in machine translation; and (c) speech recognition to competitive performance without warmup and adaptive optimizers.

Automatic Speech Recognition Image Classification +6
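
The stabilization technique named in the entry above reparameterizes weight matrices by their spectral norm. Here is a minimal sketch of that general idea, assuming a power-iteration estimate of the spectral norm and a learnable scalar gamma; it is an illustrative re-implementation, not the authors' released code, and details such as initialization may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SigmaReparamLinear(nn.Module):
    """Linear layer whose effective weight is (gamma / spectral_norm(W)) * W."""
    def __init__(self, in_features, out_features, n_power_iters=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.gamma = nn.Parameter(torch.ones(()))              # learnable rescaling scalar
        self.n_power_iters = n_power_iters
        self.register_buffer("u", torch.randn(out_features))   # power-iteration state

    def _spectral_norm(self):
        with torch.no_grad():
            u = self.u
            for _ in range(self.n_power_iters):
                v = F.normalize(self.weight.t() @ u, dim=0)
                u = F.normalize(self.weight @ v, dim=0)
            self.u.copy_(u)
        # sigma = u^T W v is differentiable w.r.t. W; u and v act as constants.
        return torch.dot(u, self.weight @ v)

    def forward(self, x):
        w_hat = (self.gamma / self._spectral_norm()) * self.weight
        return F.linear(x, w_hat, self.bias)

layer = SigmaReparamLinear(16, 32)
print(layer(torch.randn(4, 16)).shape)                         # torch.Size([4, 32])
```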

MAST: Masked Augmentation Subspace Training for Generalizable Self-Supervised Priors

no code implementations 7 Mar 2023 Chen Huang, Hanlin Goh, Jiatao Gu, Josh Susskind

We do so via Masked Augmentation Subspace Training (or MAST), encoding the priors from different data augmentations in a single feature space in a factorized way.

Instance Segmentation Self-Supervised Learning +1

Diffusion Probabilistic Fields

no code implementations 1 Mar 2023 Peiye Zhuang, Samira Abnar, Jiatao Gu, Alex Schwing, Joshua M. Susskind, Miguel Ángel Bautista

Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains.

Denoising

NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion

no code implementations 20 Feb 2023 Jiatao Gu, Alex Trevithick, Kai-En Lin, Josh Susskind, Christian Theobalt, Lingjie Liu, Ravi Ramamoorthi

Novel view synthesis from a single image requires inferring occluded regions of objects and scenes whilst simultaneously maintaining semantic and physical consistency with the input.

Novel View Synthesis

f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation

no code implementations 10 Oct 2022 Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Miguel Angel Bautista, Josh Susskind

In this work, we propose f-DM, a generalized family of DMs which allows progressive signal transformation.

Image Generation

Progressively-connected Light Field Network for Efficient View Synthesis

no code implementations 10 Jul 2022 Peng Wang, Yuan Liu, Guying Lin, Jiatao Gu, Lingjie Liu, Taku Komura, Wenping Wang

ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.

Novel View Synthesis

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders

no code implementations EACL 2021 Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

Recent work in multilingual translation advances translation quality, surpassing bilingual baselines by using deep transformer models with increased capacity.

Machine Translation Translation

Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem

no code implementations Findings (ACL) 2022 Khalil Mrini, Shaoliang Nie, Jiatao Gu, Sinong Wang, Maziar Sanjabi, Hamed Firooz

Without the use of a knowledge base or candidate sets, our model sets a new state of the art on two benchmark datasets for entity linking: COMETA in the biomedical domain, and AIDA-CoNLL in the news domain.

Entity Linking Re-Ranking

IDPG: An Instance-Dependent Prompt Generation Method

no code implementations NAACL 2022 Zhuofeng Wu, Sinong Wang, Jiatao Gu, Rui Hou, Yuxiao Dong, V. G. Vinod Vydiswaran, Hao Ma

Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage.

Language Modelling Natural Language Understanding +2

Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

no code implementations 6 Apr 2022 Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee

Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

9 code implementations Preprint 2022 Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli

While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.

Image Classification Linguistic Acceptability +5

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis

1 code implementation ICLR 2022 Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt

We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue.

Image Generation

Direct speech-to-speech translation with discrete units

1 code implementation ACL 2022 Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.

Speech-to-Speech Translation Text Generation +1

Volume Rendering of Neural Implicit Surfaces

3 code implementations NeurIPS 2021 Lior Yariv, Jiatao Gu, Yoni Kasten, Yaron Lipman

Accurate sampling is important to provide a precise coupling of geometry and radiance, and it allows efficient unsupervised disentanglement of shape and appearance in volume rendering.

Disentanglement Inductive Bias
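
The coupling of geometry and radiance mentioned above hinges on turning a signed distance function into a volume density. A minimal sketch of that mapping, assuming the commonly cited Laplace-CDF parameterization (the alpha and beta values below are arbitrary illustration choices, not the paper's settings):

```python
import torch

def sdf_to_density(sdf, alpha=10.0, beta=0.1):
    """sigma(x) = alpha * Psi_beta(-d(x)), with Psi_beta the CDF of Laplace(0, beta)."""
    s = -sdf
    return alpha * torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )

# Points far outside the surface (large positive SDF) get near-zero density,
# while points inside (negative SDF) approach the constant density alpha.
print(sdf_to_density(torch.tensor([1.0, 0.0, -1.0])))
```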

Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control

no code implementations 3 Jun 2021 Lingjie Liu, Marc Habermann, Viktor Rudnev, Kripasindhu Sarkar, Jiatao Gu, Christian Theobalt

To address this problem, we utilize a coarse body model as the proxy to unwarp the surrounding 3D space into a canonical pose.

Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

1 code implementation Findings (ACL) 2021 Jiatao Gu, Xiang Kong

Fully non-autoregressive neural machine translation (NAT) is proposed to predict all tokens simultaneously with a single forward pass of the network, which significantly reduces the inference latency at the expense of a quality drop compared to the Transformer baseline.

Machine Translation Translation

Facebook AI's WMT20 News Translation Task Submission

no code implementations 16 Nov 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Detecting Hallucinated Content in Conditional Neural Sequence Generation

2 code implementations Findings (ACL) 2021 Chunting Zhou, Graham Neubig, Jiatao Gu, Mona Diab, Paco Guzman, Luke Zettlemoyer, Marjan Ghazvininejad

Neural sequence models can generate highly fluent sentences, but recent studies have shown that they are also prone to hallucinating additional content not supported by the input.

Abstractive Text Summarization Hallucination +1

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

1 code implementation COLING 2020 Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

5 code implementations 2 Aug 2020 Yuqing Tang, Chau Tran, Xi-An Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan

Recent work demonstrates the potential of multilingual pretraining to create one model that can be used for various tasks in different languages.

Machine Translation Translation

Neural Sparse Voxel Fields

1 code implementation NeurIPS 2020 Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, Christian Theobalt

We also demonstrate several challenging tasks, including multi-scene learning, free-viewpoint rendering of a moving human, and large-scale scene rendering.

PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

2 code implementations ECCV 2020 Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas, Or Litany

To this end, we select a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes.

Point Cloud Pre-training Representation Learning +3

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation

no code implementations ACL 2020 Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong

This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs).

Machine Translation NMT +1

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

no code implementations WS 2020 Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured six challenge tracks this year: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

Self-Supervised Representations Improve End-to-End Speech Translation

no code implementations 22 Jun 2020 Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu

End-to-end speech-to-text translation can provide a simpler and smaller system but is facing the challenge of data scarcity.

Cross-Lingual Transfer speech-recognition +3

Cross-lingual Retrieval for Iterative Self-Supervised Training

1 code implementation NeurIPS 2020 Chau Tran, Yuqing Tang, Xi-An Li, Jiatao Gu

Recent studies have demonstrated the cross-lingual alignment ability of multilingual pretrained language models.

Retrieval Sentence +2

CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus

1 code implementation LREC 2020 Changhan Wang, Juan Pino, Anne Wu, Jiatao Gu

Spoken language translation has recently witnessed a resurgence in popularity, thanks to the development of end-to-end models and the creation of new corpora, such as Augmented LibriSpeech and MuST-C.

Speech-to-Text Translation Translation

Multilingual Denoising Pre-training for Neural Machine Translation

5 code implementations 22 Jan 2020 Yinhan Liu, Jiatao Gu, Naman Goyal, Xi-An Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Denoising Sentence +2

Non-Autoregressive Machine Translation with Disentangled Context Transformer

1 code implementation 15 Jan 2020 Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens.

Machine Translation Sentence +1

Understanding Knowledge Distillation in Non-autoregressive Machine Translation

no code implementations ICLR 2020 Chunting Zhou, Graham Neubig, Jiatao Gu

We find that knowledge distillation can reduce the complexity of data sets and help NAT to model the variations in the output data.

Knowledge Distillation Machine Translation +1
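
The distillation mentioned above is usually sequence-level: the non-autoregressive student is trained on targets decoded by an autoregressive teacher rather than on the raw references. A minimal sketch of that data flow, with a trivial stand-in for the teacher (the function names are hypothetical, not the paper's code):

```python
def build_distilled_corpus(translate_fn, source_sentences):
    """Pair each source sentence with the teacher's translation instead of the
    gold target; the NAT student is then trained on these pairs, which are
    typically less multi-modal and therefore easier to fit in parallel."""
    return [(src, translate_fn(src)) for src in source_sentences]

# Toy usage: an uppercasing "teacher" stands in for a beam-searched AR model.
corpus = build_distilled_corpus(str.upper, ["ein beispiel", "noch ein satz"])
print(corpus)   # [('ein beispiel', 'EIN BEISPIEL'), ('noch ein satz', 'NOCH EIN SATZ')]
```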

Depth-Adaptive Transformer

no code implementations ICLR 2020 Maha Elbayad, Jiatao Gu, Edouard Grave, Michael Auli

State of the art sequence-to-sequence models for large scale tasks perform a fixed number of computations for each input sequence regardless of whether it is easy or hard to process.

Machine Translation Translation
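
To make the adaptive-computation idea above concrete, here is a minimal early-exit sketch: an intermediate classifier checks its confidence after each layer and stops as soon as it crosses a threshold. The toy layers, shared classifier, and threshold are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

HIDDEN, VOCAB, N_LAYERS = 32, 100, 6

layers = nn.ModuleList([nn.Linear(HIDDEN, HIDDEN) for _ in range(N_LAYERS)])
classifier = nn.Linear(HIDDEN, VOCAB)   # exit classifier shared across depths

def adaptive_forward(x, threshold=0.9):
    """Return (prediction, depth used): exit at the first layer whose softmax
    confidence exceeds the threshold instead of always running all layers."""
    h = x
    for depth, layer in enumerate(layers, start=1):
        h = torch.relu(layer(h))
        probs = classifier(h).softmax(-1)
        if probs.max() >= threshold:
            return probs.argmax(-1), depth
    return probs.argmax(-1), N_LAYERS

pred, depth = adaptive_forward(torch.randn(HIDDEN))
print(pred.item(), depth)
```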

Revisiting Self-Training for Neural Sequence Generation

1 code implementation ICLR 2020 Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato

In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks.

Machine Translation Text Summarization +1

The Source-Target Domain Mismatch Problem in Machine Translation

no code implementations EACL 2021 Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato

While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures, and many events we experience in our everyday lives pertain only to the specific place we live in.

Machine Translation Translation

Monotonic Multihead Attention

3 code implementations ICLR 2020 Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu

Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence.

Machine Translation Translation

Improved Variational Neural Machine Translation by Promoting Mutual Information

no code implementations 19 Sep 2019 Arya D. McCarthy, Xi-An Li, Jiatao Gu, Ning Dong

Posterior collapse plagues VAEs for text, especially for conditional text generation with strong autoregressive decoders.

Conditional Text Generation Machine Translation +1

VizSeq: A Visual Analysis Toolkit for Text Generation Tasks

1 code implementation IJCNLP 2019 Changhan Wang, Anirudh Jain, Danlu Chen, Jiatao Gu

Automatic evaluation of text generation tasks (e.g., machine translation, text summarization, image captioning and video description) usually relies heavily on task-specific metrics, such as BLEU and ROUGE.

Benchmarking Image Captioning +5
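
As a concrete example of the task-specific metrics the snippet above refers to, corpus-level BLEU can be computed with the sacrebleu package (assuming `pip install sacrebleu`; the sentences below are made up for illustration):

```python
import sacrebleu

hypotheses = ["the cat sat on the mat", "a quick brown fox"]
references = [["the cat is sitting on the mat", "the quick brown fox"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)   # corpus BLEU on the toy example
```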

Neural Machine Translation with Byte-Level Subwords

1 code implementation 7 Sep 2019 Changhan Wang, Kyunghyun Cho, Jiatao Gu

Representing text at the byte level and using the set of 256 possible byte values as the vocabulary is a potential solution to this issue.

Machine Translation Translation
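
The byte-level representation described above needs no learned vocabulary for the base symbols: every string already maps losslessly onto tokens drawn from the 256 possible byte values. A minimal sketch (the learned byte-pair merges that the paper builds on top of this are omitted):

```python
text = "héllo 世界"

byte_ids = list(text.encode("utf-8"))      # token ids in the range 0..255
print(byte_ids)

decoded = bytes(byte_ids).decode("utf-8")  # lossless round trip back to text
assert decoded == text
```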

Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

no code implementations ACL 2019 Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings.

Machine Translation NMT +1

Levenshtein Transformer

3 code implementations NeurIPS 2019 Jiatao Gu, Changhan Wang, Jake Zhao

We further confirm the flexibility of our model by showing a Levenshtein Transformer trained by machine translation can straightforwardly be used for automatic post-editing.

Automatic Post-Editing Text Summarization +1
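
The edit-based decoding that gives the model its name alternates deletion and insertion operations over a partial sequence. Below is a very loose, rule-based sketch of that refinement loop, with trivial policies standing in for the learned deletion and insertion heads (names and control flow are illustrative assumptions, not the paper's algorithm):

```python
def refine(tokens, should_delete, insertions, max_iters=3):
    """Alternate a deletion pass and an insertion pass over the sequence.

    should_delete: token -> bool           (stand-in for a learned deletion head)
    insertions:    (left, right) -> list   (stand-in for learned insertion heads)
    """
    for _ in range(max_iters):
        # Deletion pass: drop tokens the policy marks for removal.
        tokens = [t for t in tokens if not should_delete(t)]
        # Insertion pass: place new tokens between adjacent survivors.
        out = []
        for left, right in zip(tokens, tokens[1:] + [None]):
            out.append(left)
            out.extend(insertions(left, right))
        tokens = out
    return tokens

# Toy usage: delete placeholder tokens, never insert anything new.
print(refine(["a", "<pad>", "b"], lambda t: t == "<pad>", lambda l, r: []))
```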

Insertion-based Decoding with automatically Inferred Generation Order

no code implementations TACL 2019 Jiatao Gu, Qi Liu, Kyunghyun Cho

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal.

Code Generation Machine Translation +1

Meta-Learning for Low-Resource Neural Machine Translation

no code implementations EMNLP 2018 Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li

We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks.

Low-Resource Neural Machine Translation Meta-Learning +3

Universal Neural Machine Translation for Extremely Low Resource Languages

no code implementations NAACL 2018 Jiatao Gu, Hany Hassan, Jacob Devlin, Victor O. K. Li

Our proposed approach uses transfer learning to share lexical and sentence-level representations across multiple source languages into one target language.

Machine Translation Sentence +2

Neural Machine Translation with Gumbel-Greedy Decoding

no code implementations 22 Jun 2017 Jiatao Gu, Daniel Jiwoong Im, Victor O. K. Li

Previous neural machine translation models used heuristic search algorithms (e.g., beam search) to avoid solving the maximum a posteriori problem over translation sentences at test time.

Machine Translation Translation
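
For readers unfamiliar with the heuristic search the abstract contrasts against, here is a minimal, generic beam search sketch over an arbitrary next-token log-probability function (the toy `step_fn`, vocabulary size, and beam width are illustrative assumptions, not this paper's decoder):

```python
import math

def beam_search(step_fn, vocab_size, beam_size=3, max_len=5, eos=0):
    beams = [([], 0.0)]                        # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:         # finished hypotheses carry over unchanged
                candidates.append((seq, score))
                continue
            logprobs = step_fn(seq)            # list of vocab_size log-probabilities
            for tok in range(vocab_size):
                candidates.append((seq + [tok], score + logprobs[tok]))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0]

# Toy model that prefers token 1 for three steps, then ends with EOS (token 0).
def step_fn(seq):
    favored = 0 if len(seq) >= 3 else 1
    return [math.log(0.9) if t == favored else math.log(0.05) for t in range(3)]

print(beam_search(step_fn, vocab_size=3))      # best hypothesis and its score
```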

Search Engine Guided Non-Parametric Neural Machine Translation

no code implementations 20 May 2017 Jiatao Gu, Yong Wang, Kyunghyun Cho, Victor O. K. Li

In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training.

Machine Translation NMT +3

Trainable Greedy Decoding for Neural Machine Translation

1 code implementation EMNLP 2017 Jiatao Gu, Kyunghyun Cho, Victor O. K. Li

Instead of trying to build a new decoding algorithm for any specific decoding objective, we propose the idea of a trainable decoding algorithm, in which we train a decoding algorithm to find a translation that maximizes an arbitrary decoding objective.

Machine Translation Translation

Incorporating Copying Mechanism in Sequence-to-Sequence Learning

7 code implementations ACL 2016 Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li

CopyNet can nicely integrate the regular way of word generation in the decoder with the new copying mechanism, which can choose sub-sequences in the input sequence and put them at proper places in the output sequence.

Text Summarization
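
The integration of generation and copying can be pictured as mixing two distributions: one over the decoder's vocabulary and one over positions in the source. The sketch below uses a simple pointer-generator-style gate to combine them; CopyNet's exact scoring differs, so treat this purely as an illustration of the idea (all tensors are random stand-ins for learned quantities):

```python
import torch

vocab = ["<unk>", "the", "cat", "sat", "zygote"]
source_tokens = [4, 1]                       # "zygote the" appears in the input

gen_probs = torch.softmax(torch.randn(len(vocab)), dim=0)            # generation distribution
copy_attn = torch.softmax(torch.randn(len(source_tokens)), dim=0)    # attention over source
p_gen = torch.sigmoid(torch.randn(()))       # gate; learned in a real model

# Scatter the copy attention back onto the vocabulary ids of the source tokens.
copy_probs = torch.zeros(len(vocab))
copy_probs.index_add_(0, torch.tensor(source_tokens), copy_attn)

final_probs = p_gen * gen_probs + (1 - p_gen) * copy_probs
print(final_probs, final_probs.sum())        # still a valid distribution (sums to 1)
```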

Efficient Learning for Undirected Topic Models

no code implementations IJCNLP 2015 Jiatao Gu, Victor O. K. Li

Replicated Softmax model, a well-known undirected topic model, is powerful in extracting semantic representations of documents.

General Classification Retrieval +1
