no code implementations • EACL (AdaptNLP) 2021 • Tianyu Chen, Shaohan Huang, Furu Wei, JianXin Li
In unsupervised domain adaptation, we aim to train a model that works well on a target domain when provided with labeled source samples and unlabeled target samples.
no code implementations • 8 May 2024 • Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei
We introduce a decoder-decoder architecture, YOCO, for large language models, which only caches key-value pairs once.
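A minimal sketch of the decoder-decoder idea, assuming illustrative layer counts, dimensions, and module names (not YOCO's actual configuration): a self-decoder encodes the sequence, the key-value pairs are produced and cached once, and every cross-decoder layer reuses that single cache. Causal masks and the efficient self-attention variant are omitted for brevity.

```python
import torch
import torch.nn as nn

class CrossDecoderLayer(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, num_heads=8, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x, kv):
        h, _ = self.attn(x, kv, kv)      # attend to the shared global KV cache
        x = x + h
        return x + self.ffn(x)

class DecoderDecoder(nn.Module):
    def __init__(self, d=512, n_self=4, n_cross=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.self_decoder = nn.TransformerEncoder(layer, n_self)
        self.to_kv = nn.Linear(d, d)     # key-value pairs are produced ONCE here
        self.cross_decoder = nn.ModuleList(
            [CrossDecoderLayer(d) for _ in range(n_cross)])

    def forward(self, x):                # x: (batch, seq, d)
        kv = self.to_kv(self.self_decoder(x))   # cache once...
        for layer in self.cross_decoder:
            x = layer(x, kv)                    # ...reuse in every layer
        return x
```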
no code implementations • 23 Apr 2024 • Xun Wu, Shaohan Huang, Furu Wei
Recent studies have demonstrated the exceptional potential of leveraging human preference datasets to refine text-to-image generative models, enhancing the alignment between generated images and textual prompts.
no code implementations • 23 Apr 2024 • Xun Wu, Shaohan Huang, Wenhui Wang, Furu Wei
These sub-tokens are then assigned to and processed by a diverse set of experts in parallel, and seamlessly reintegrated into the original token form.
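A toy sketch of the sub-token routing described above, with top-1 routing and illustrative dimensions; the expert design and merge layer are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SubTokenMoE(nn.Module):
    def __init__(self, d=512, heads=4, n_experts=8):
        super().__init__()
        assert d % heads == 0
        self.heads, self.sub = heads, d // heads
        self.router = nn.Linear(self.sub, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(self.sub, 2 * self.sub), nn.GELU(),
                           nn.Linear(2 * self.sub, self.sub))
             for _ in range(n_experts)])
        self.merge = nn.Linear(d, d)

    def forward(self, x):                             # x: (batch, seq, d)
        b, s, d = x.shape
        sub = x.reshape(b, s * self.heads, self.sub)  # split tokens into sub-tokens
        gate = self.router(sub).argmax(-1)            # top-1 expert per sub-token
        out = torch.zeros_like(sub)
        for e, expert in enumerate(self.experts):
            mask = gate == e
            if mask.any():
                out[mask] = expert(sub[mask])         # experts process sub-tokens in parallel
        return self.merge(out.reshape(b, s, d))       # reintegrate into token form
```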
1 code implementation • 21 Apr 2024 • Xun Wu, Shaohan Huang, Furu Wei
LoRA has gained widespread acceptance in the fine-tuning of large pre-trained models to cater to a diverse array of downstream tasks, showcasing notable effectiveness and efficiency, thereby solidifying its position as one of the most prevalent fine-tuning techniques.
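For context, a minimal LoRA layer looks roughly like the following; rank, scaling, and the zero-initialized up-projection follow the original LoRA recipe, while the class and parameter names are mine.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update: W x + (alpha/r) B A x."""
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)               # pre-trained weight stays frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)   # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))         # up-projection, zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T
```

Because B starts at zero, the adapted model initially matches the frozen base model, and only the small A and B matrices are trained.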
1 code implementation • 28 Feb 2024 • Shuhua Shi, Shaohan Huang, Minghui Song, Zhoujun Li, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
As one of the most popular parameter-efficient fine-tuning (PEFT) methods, low-rank adaptation (LoRA) is commonly applied to fine-tune large language models (LLMs).
4 code implementations • 27 Feb 2024 • Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei
Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs).
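As a rough illustration of what 1-bit (more precisely, 1.58-bit ternary) weights mean, here is a sketch of absmean weight quantization in the spirit of BitNet b1.58; activation quantization and the straight-through training recipe are omitted.

```python
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Quantize weights to {-1, 0, +1} with a per-tensor absmean scale.

    A sketch in the spirit of BitNet b1.58; the real model also quantizes
    activations and trains with a straight-through estimator.
    """
    gamma = w.abs().mean().clamp(min=eps)      # absmean scale
    w_q = (w / gamma).round().clamp(-1, 1)     # ternary values
    return w_q, gamma                          # dequantize as w_q * gamma
```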
no code implementations • 24 Feb 2024 • Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
Large language models (LLMs) have emerged as a promising alternative to expensive human evaluations.
no code implementations • 21 Feb 2024 • Haoyu Liu, Jianfeng Liu, Shaohan Huang, Yuefeng Zhan, Hao Sun, Weiwei Deng, Furu Wei, Qi Zhang
The remarkable capability of large language models (LLMs) for in-context learning (ICL) needs to be activated by demonstration examples.
no code implementations • 20 Feb 2024 • Haoran Li, Qingxiu Dong, Zhengyang Tang, Chaojun Wang, Xingxing Zhang, Haoyang Huang, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei
We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).
no code implementations • 19 Feb 2024 • Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
Diffusion models have demonstrated exceptional capability in generating high-quality images, videos, and audio.
1 code implementation • 14 Jan 2024 • Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang
To enhance the domain-specific capabilities of large language models, continued pre-training on a domain-specific corpus is a prevalent method.
1 code implementation • 20 Oct 2023 • Zhaoyang Wang, Shaohan Huang, Yuxuan Liu, Jiahai Wang, Minghui Song, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
In this paper, we propose a tailored learning approach to distill such reasoning ability into smaller LMs, facilitating the democratization of this exclusive ability.
2 code implementations • 17 Oct 2023 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei
The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption.
1 code implementation • 4 Oct 2023 • Xichen Pan, Li Dong, Shaohan Huang, Zhiliang Peng, Wenhu Chen, Furu Wei
These limitations keep them far from the ultimate goal of "image as a foreign language in image generation."
no code implementations • 23 Sep 2023 • Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
Recent advancements in large language models (LLMs) on language modeling and emergent capabilities make them a promising reference-free evaluator of natural language generation quality, and a competent alternative to human evaluation.
no code implementations • 20 Sep 2023 • Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei
We present Kosmos-2.5, a multimodal literate model for machine reading of text-intensive images.
1 code implementation • 18 Sep 2023 • Daixuan Cheng, Shaohan Huang, Furu Wei
Taking inspiration from human learning via reading comprehension, where practice after reading improves the ability to answer questions based on the learned knowledge, we propose a simple method for transforming raw corpora into reading comprehension texts.
no code implementations • 3 Sep 2023 • Jiaxing Qi, Shaohan Huang, Zhongzhi Luan, Carol Fung, Hailong Yang, Depei Qian
In this work, we propose LogGPT, a log-based anomaly detection framework based on ChatGPT.
1 code implementation • 31 Jul 2023 • Ting Jiang, Shaohan Huang, Zhongzhi Luan, Deqing Wang, Fuzhen Zhuang
We also fine-tune LLMs with the current contrastive learning approach; incorporating our prompt-based method, the 2.7B OPT model surpasses the performance of the 4.8B ST5 and achieves new state-of-the-art results on STS tasks.
Ranked #1 on Semantic Textual Similarity on STS12
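A hedged sketch of the prompt-based method referenced above: wrap the sentence in a prompt that asks the LLM to summarize it "in one word" and take the final token's hidden state as the embedding. The checkpoint name and exact prompt wording are assumptions for illustration.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-2.7b")
model = AutoModel.from_pretrained("facebook/opt-2.7b")

def embed(sentence: str) -> torch.Tensor:
    # Prompt template is an assumption modeled on the "in one word" idea.
    prompt = f'This sentence : "{sentence}" means in one word:"'
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq, dim)
    return hidden[0, -1]                            # last-token hidden state
```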
8 code implementations • 17 Jul 2023 • Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance.
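The low-cost inference comes from retention's recurrent form: a fixed-size state is decayed and updated per step, so decoding costs O(1) per token. A single-head sketch, with normalization and RetNet's per-head decay schedule omitted:

```python
import torch

def recurrent_retention(q, k, v, gamma=0.9):
    """Recurrent retention: S_n = gamma * S_{n-1} + k_n^T v_n, o_n = q_n S_n.

    q, k, v: (seq, d) single-head tensors; gamma is a per-head decay in RetNet.
    """
    d = q.size(-1)
    S = torch.zeros(d, d)
    outs = []
    for n in range(q.size(0)):
        S = gamma * S + torch.outer(k[n], v[n])   # constant-size recurrent state
        outs.append(q[n] @ S)                     # O(1) work per generated token
    return torch.stack(outs)
```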
3 code implementations • 5 Jul 2023 • Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
Scaling sequence length has become a critical demand in the era of large language models.
2 code implementations • 26 Jun 2023 • Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei
We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world.
Ranked #11 on Visual Question Answering on ViP-Bench
no code implementations • 31 May 2023 • Tianyu Chen, Yuan Xie, Shuai Zhang, Shaohan Huang, Haoyi Zhou, JianXin Li
Music representation learning is notoriously difficult because of the complex human-related concepts contained in sequences of numerical signals.
1 code implementation • 16 May 2023 • Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang
Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding.
no code implementations • 6 May 2023 • Beiduo Chen, Shaohan Huang, Zihan Zhang, Wu Guo, ZhenHua Ling, Haizhen Huang, Furu Wei, Weiwei Deng, Qi Zhang
In addition, two self-correction courses are proposed to bridge the gap between the two encoders by creating a "correction notebook" for secondary supervision.
1 code implementation • 15 Mar 2023 • Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Denvy Deng, Qi Zhang
Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization.
1 code implementation • NeurIPS 2023 • Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei
A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence.
5 code implementations • 20 Dec 2022 • Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei
Position modeling plays a critical role in Transformers.
1 code implementation • 20 Dec 2022 • Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li
Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.
1 code implementation • 23 Nov 2022 • Shuming Ma, Hongyu Wang, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei
Large Transformers have achieved state-of-the-art performance across many tasks.
no code implementations • 26 Oct 2022 • Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song
In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications.
1 code implementation • 13 Oct 2022 • Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei
Specifically, the target sequence is first translated into the source language and then tagged by a source NER model.
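In outline, that data-synthesis step can be sketched as below; translate_to_source and source_ner are hypothetical placeholders for an MT system and a source-language tagger.

```python
def synthesize_labeled_data(target_sentences, translate_to_source, source_ner):
    """Translate-then-tag: obtain NER supervision for unlabeled target text."""
    examples = []
    for tgt in target_sentences:
        src = translate_to_source(tgt)   # target sequence rendered in the source language
        tags = source_ner(src)           # labeled by the source-language NER model
        examples.append((src, tags, tgt))
    return examples
```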
4 code implementations • 12 Oct 2022 • Hongyu Wang, Shuming Ma, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei
A big convergence of model architectures across language, vision, speech, and multimodal is emerging.
no code implementations • 19 Jul 2022 • Yuan Xie, Shaohan Huang, Tianyu Chen, Furu Wei
The sparse Mixture-of-Experts (MoE) architecture has received great interest due to its promising scaling capability with affordable computational overhead.
1 code implementation • 13 Jun 2022 • Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei
Experimental results across various language-only and vision-language benchmarks show that our model outperforms or is competitive with specialized models on finetuning, zero-shot generalization, and few-shot learning.
Ranked #2 on Image Captioning on nocaps val
no code implementations • Findings (ACL) 2022 • Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei
As more and more pre-trained language models adopt on-cloud deployment, privacy issues grow quickly, mainly due to the exposure of plain-text user data (e.g., search history, medical records, bank accounts).
no code implementations • 1 Jun 2022 • Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei
The sparse Mixture-of-Experts (MoE) model is powerful for large-scale pre-training and has achieved promising results due to its model capacity.
2 code implementations • 20 Apr 2022 • Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei
We also present a comprehensive analysis on the representation and routing behaviors of our models.
6 code implementations • 1 Mar 2022 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei
In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers.
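The method, DeepNorm, rescales the residual connection before layer normalization with a depth-dependent constant. A sketch for the encoder-only setting, assuming alpha = (2N)^(1/4) and omitting the companion initialization scaling:

```python
import torch.nn as nn

class DeepNormBlock(nn.Module):
    """Residual sublayer wrapper: x <- LayerNorm(alpha * x + f(x)).

    For an encoder-only stack of N layers, DeepNet uses alpha = (2N) ** 0.25;
    decoder stacks use different constants, and init scaling is omitted here.
    """
    def __init__(self, d, sublayer, n_layers):
        super().__init__()
        self.f = sublayer
        self.alpha = (2 * n_layers) ** 0.25
        self.norm = nn.LayerNorm(d)

    def forward(self, x):
        return self.norm(self.alpha * x + self.f(x))
```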
1 code implementation • 15 Jan 2022 • Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang
In this work, we propose a simple model, Kformer, which takes advantage of the knowledge stored in PTMs and external knowledge via knowledge injection in Transformer FFN layers.
1 code implementation • 12 Jan 2022 • Ting Jiang, Jian Jiao, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Denvy Deng, Qi Zhang
We propose PromptBERT, a novel contrastive learning method for learning better sentence representation.
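Concretely, the prompt-based representation reformulates a sentence as a fill-in template and reads off the [MASK] position's hidden state; the template below follows the paper's style, though the exact wording here is an assumption.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def prompt_embed(sentence: str) -> torch.Tensor:
    # Template in the style of PromptBERT; exact wording is an assumption.
    prompt = f'This sentence : "{sentence}" means [MASK] .'
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state            # (1, seq, dim)
    mask_pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0]
    return hidden[0, mask_pos.item()]                         # [MASK] hidden state
```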
no code implementations • WMT (EMNLP) 2021 • Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei
This report describes Microsoft's machine translation systems for the WMT21 shared task on large-scale multilingual machine translation.
1 code implementation • 21 Oct 2021 • Ting Jiang, Shaohan Huang, Zihan Zhang, Deqing Wang, Fuzhen Zhuang, Furu Wei, Haizhen Huang, Liangjie Zhang, Qi Zhang
While pre-trained language models have achieved great success on various natural language understanding tasks, how to effectively leverage them into non-autoregressive generation tasks remains a challenge.
2 code implementations • EMNLP 2021 • Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei
We find that many languages are under-represented in recent cross-lingual language models due to the limited vocabulary capacity.
3 code implementations • ACL 2022 • Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei
In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training.
Ranked #1 on Zero-Shot Cross-Lingual Transfer on XTREME
no code implementations • Findings (ACL) 2021 • Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei
In this paper, we present a general approach to developing small, fast and effective pre-trained models for specific domains.
2 code implementations • 25 Jun 2021 • Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei
While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG).
1 code implementation • ACL 2021 • Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei
Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to the others.
1 code implementation • ACL 2021 • Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei
The cross-lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences.
2 code implementations • Findings (ACL) 2021 • Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
We generalize deep self-attention distillation in MiniLM (Wang et al., 2020) by only using self-attention relation distillation for task-agnostic compression of pretrained Transformers.
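A sketch of self-attention relation distillation: both models form pairwise relation distributions softmax(XX^T / sqrt(d)) over positions (e.g., among queries, keys, or values), and the student matches the teacher's distributions via KL divergence. Shapes and scaling below are illustrative.

```python
import torch
import torch.nn.functional as F

def relation_kl(teacher_x, student_x):
    """KL between self-attention relations softmax(X X^T / sqrt(d)).

    teacher_x, student_x: (seq, d_t) and (seq, d_s). Only the (seq x seq)
    relation matrices must match, so teacher and student widths may differ.
    """
    def relations(x):
        return F.log_softmax(x @ x.T / x.size(-1) ** 0.5, dim=-1)
    t = relations(teacher_x)
    s = relations(student_x)
    return F.kl_div(s, t, log_target=True, reduction="batchmean")
```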
no code implementations • COLING 2020 • Shaohan Huang, Furu Wei, Lei Cui, Xingxing Zhang, Ming Zhou
Fine-tuning with pre-trained language models (e.g., BERT) has achieved great success in many language understanding tasks in supervised settings (e.g., text classification).
no code implementations • AACL 2020 • Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Minlie Huang
Commonsense explanation generation aims to empower a machine's sense-making capability by generating plausible explanations for statements that contradict commonsense.
1 code implementation • EMNLP 2020 • Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Xiaoyan Zhu, Minlie Huang
Despite the success of generative pre-trained language models on a series of text generation tasks, they still suffer in cases where reasoning over underlying commonsense knowledge is required during generation.
2 code implementations • COLING 2020 • Minghao Li, Yiheng Xu, Lei Cui, Shaohan Huang, Furu Wei, Zhoujun Li, Ming Zhou
DocBank is constructed in a simple yet effective way, with weak supervision from the LaTeX documents available on arXiv.com.
1 code implementation • LREC 2020 • Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou, Zhoujun Li
We present TableBank, a new image-based table detection and recognition dataset built with novel weak supervision from Word and LaTeX documents on the internet.
16 code implementations • 31 Dec 2019 • Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou
In this paper, we propose LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents.
Ranked #7 on Relation Extraction on FUNSD
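The core modeling idea, adding 2-D position embeddings for each token's bounding box on top of the usual text embeddings, can be sketched as follows; the 1,000 coordinate bins and shared x/y tables are in the spirit of LayoutLM, while names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class TextLayoutEmbedding(nn.Module):
    """Token embedding + 1-D position + 2-D bounding-box embeddings."""
    def __init__(self, vocab_size, d=768, max_pos=512, coord_bins=1000):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d)
        self.pos = nn.Embedding(max_pos, d)
        self.x_emb = nn.Embedding(coord_bins, d)   # shared table for x0 and x1
        self.y_emb = nn.Embedding(coord_bins, d)   # shared table for y0 and y1

    def forward(self, ids, boxes):
        # ids: (batch, seq) long; boxes: (batch, seq, 4) long,
        # as (x0, y0, x1, y1) coordinates binned into [0, 1000)
        pos = torch.arange(ids.size(1), device=ids.device)
        return (self.tok(ids) + self.pos(pos)
                + self.x_emb(boxes[..., 0]) + self.y_emb(boxes[..., 1])
                + self.x_emb(boxes[..., 2]) + self.y_emb(boxes[..., 3]))
```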
no code implementations • 30 Sep 2018 • Shaohan Huang, Yu Wu, Furu Wei, Ming Zhou
In this paper, we introduce a novel natural language generation task, termed text morphing, which aims to generate intermediate sentences that are fluent and form a smooth transition between the two input sentences.
no code implementations • 12 Sep 2018 • Hangbo Bao, Shaohan Huang, Furu Wei, Lei Cui, Yu Wu, Chuanqi Tan, Songhao Piao, Ming Zhou
In this paper, we study a novel task that learns to compose music from natural language.
1 code implementation • ACL 2018 • Qingyu Zhou, Nan Yang, Furu Wei, Shaohan Huang, Ming Zhou, Tiejun Zhao
In this paper, we present a novel end-to-end neural network framework for extractive document summarization by jointly learning to score and select sentences.
Ranked #9 on Extractive Text Summarization on CNN / Daily Mail
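The joint score-and-select loop can be sketched as greedy extraction: at each step, score the remaining sentences conditioned on what has already been selected, then pick the best. The scorer here is a hypothetical stand-in for the paper's learned scoring network.

```python
def extract_summary(sentences, score, k=3):
    """Greedy joint score-and-select sketch.

    score(candidate, selected) is a hypothetical learned scorer that rates a
    candidate sentence conditioned on the sentences already selected.
    """
    selected, remaining = [], list(sentences)
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda s: score(s, selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```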
no code implementations • 21 Jun 2018 • Shaohan Huang, Yu Wu, Furu Wei, Ming Zhou
An intuitive way for a human to write paraphrase sentences is to replace words or phrases in the original sentence with their corresponding synonyms and make necessary changes to ensure the new sentences are fluent and grammatically correct.
3 code implementations • 19 Jun 2018 • Yu Wu, Furu Wei, Shaohan Huang, Yunli Wang, Zhoujun Li, Ming Zhou
Open domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses.
no code implementations • EACL 2017 • Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu
This paper presents an attention-enhanced attribute-to-sequence model to generate product reviews for given attribute information, such as user, product, and rating.