BART is a denoising autoencoder for pretraining sequence-to-sequence models. It is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based seq2seq/neural machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT). In other words, the encoder's attention mask is fully visible, as in BERT, while the decoder's attention mask is causal, as in GPT-2.

Source: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
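As an illustration of this encoder-decoder setup, here is a minimal sketch using the Hugging Face transformers library and the public facebook/bart-large checkpoint (an assumption for illustration, not part of the original page): the input is corrupted by hand with a <mask> token, mimicking BART's text-infilling noise, and the pretrained seq2seq model reconstructs the original text.

from transformers import BartTokenizer, BartForConditionalGeneration

# Assumed setup: Hugging Face `transformers` with the facebook/bart-large checkpoint.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

# Corrupt the input with a <mask> token (text infilling is one of BART's noising functions);
# the model is asked to reconstruct the original, uncorrupted sentence.
text = "BART is a denoising <mask> for pretraining sequence-to-sequence models."
inputs = tokenizer(text, return_tensors="pt")

# The bidirectional encoder reads the corrupted text with a fully visible attention mask;
# the autoregressive decoder generates the reconstruction left to right.
output_ids = model.generate(**inputs, num_beams=4, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))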

Latest Papers

Topic-Aware Abstractive Text Summarization
Chujie Zheng, Kunpeng Zhang, Harry Jiannan Wang, Ling Fan
2020-10-20

Dimsum @LaySumm 20: BART-based Approach for Scientific Document Summarization
Tiezheng Yu, Dan Su, Wenliang Dai, Pascale Fung
2020-10-19

Understanding Neural Abstractive Summarization Models via Uncertainty
Jiacheng Xu, Shrey Desai, Greg Durrett
2020-10-15

What Have We Achieved on Text Summarization?
Dandan Huang, Leyang Cui, Sen Yang, Guangsheng Bao, Kun Wang, Jun Xie, Yue Zhang
2020-10-09

Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing
Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta
2020-10-07

Incorporating Behavioral Hypotheses for Query Generation
Ruey-Cheng Chen, Chia-Jung Lee
2020-10-06

Beyond [CLS] through Ranking by Generation
Cicero Nogueira dos Santos, Xiaofei Ma, Ramesh Nallapati, Zhiheng Huang, Bing Xiang
2020-10-06

PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation
Xinyu Hua, Lu Wang
2020-10-05

STIL -- Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++
Jack G. M. FitzGerald
2020-10-02

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning
Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
2020-09-26

MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
2020-09-25

Estimation of causal effects of multiple treatments in healthcare database studies with rare outcomes
Liangyuan Hu, Chenyang Gu
2020-08-18

Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
2020-08-06

Investigating Pretrained Language Models for Graph-to-Text Generation
Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
2020-07-16

POSTECH Submission on Duolingo Shared Task
Junsu Park, Hongseok Kwon, Jong-Hyeok Lee
2020-07-01

Stronger Baselines for Grammatical Error Correction Using Pretrained Encoder-Decoder Model
Satoru Katsumata, Mamoru Komachi
2020-05-24

Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation
Asa Cooper Stickland, Xian Li, Marjan Ghazvininejad
2020-04-30

PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation
Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si
2020-04-14

Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer
2020-01-22

Make Lead Bias in Your Favor: Zero-shot Abstractive News Summarization
Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang
2019-12-25

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer
2019-10-29

Categories