Search Results for author: Yaobo Liang

Found 26 papers, 13 papers with code

XGLUE: A New Benchmark Datasetfor Cross-lingual Pre-training, Understanding and Generation

no code implementations • EMNLP 2020 • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks.

Natural Language Understanding XLM-R

Paper
Add Code

PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion

1 code implementation • 6 Mar 2024 • Zekai Zhang, Yiduo Guo, Yaobo Liang, Dongyan Zhao, Nan Duan

The growing dependence on Large Language Models (LLMs) for finishing user instructions necessitates a comprehensive understanding of their robustness to complex task completion in real-world situations.

Sentence

Paper
Code

Competition-Level Problems are Effective LLM Evaluators

no code implementations • 4 Dec 2023 • Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen

Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet there is ongoing debate about these abilities and the potential data contamination problem recently.

Paper
Add Code

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

1 code implementation • 3 Nov 2023 • Yiduo Guo, Zekai Zhang, Yaobo Liang, Dongyan Zhao, Nan Duan

Recent evaluations of Large Language Models (LLMs) have centered around testing their zero-shot/few-shot capabilities for basic natural language tasks and their ability to translate instructions into tool APIs.

Paper
Code

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

no code implementations • 12 Oct 2023 • Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei, Nan Duan

In this paper, we propose a new framework called Evaluation-guided Iterative Plan Extraction for long-form narrative text generation (EIPE-text), which extracts plans from the corpus of narratives and utilizes the extracted plans to construct a better planner.

In-Context Learning Text Generation

Paper
Add Code

GameEval: Evaluating LLMs on Conversational Games

1 code implementation • 19 Aug 2023 • Dan Qiao, Chenfei Wu, Yaobo Liang, Juntao Li, Nan Duan

In this paper, we propose GameEval, a novel approach to evaluating LLMs through goal-driven conversational games, overcoming the limitations of previous methods.

Question Answering

Paper
Code

Machine-Created Universal Language for Cross-lingual Transfer

1 code implementation • 22 May 2023 • Yaobo Liang, Quanzhi Zhu, Junhe Zhao, Nan Duan

There are two primary approaches to addressing cross-lingual transfer: multilingual pre-training, which implicitly aligns the hidden representations of various languages, and translate-test, which explicitly translates different languages into an intermediate language, such as English.

Cross-Lingual Transfer

Paper
Code

Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast

no code implementations • 19 May 2023 • Yiduo Guo, Yaobo Liang, Dongyan Zhao, Bing Liu, Duan Nan

Existing research has shown that a multilingual pre-trained language model fine-tuned with one (source) language also performs well on downstream tasks for non-source languages, even though no fine-tuning is done on these languages.

Cross-Lingual Transfer Language Modelling

Paper
Add Code

Learning to Plan with Natural Language

1 code implementation • 20 Apr 2023 • Yiduo Guo, Yaobo Liang, Chenfei Wu, Wenshan Wu, Dongyan Zhao, Nan Duan

To obtain it, we propose the Learning to Plan method, which involves two phases: (1) In the first learning task plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive from training error feedback.

Transfer Learning

Paper
Code

Low-code LLM: Graphical User Interface over Large Language Models

2 code implementations • 17 Apr 2023 • Yuzhe Cai, Shaoguang Mao, Wenshan Wu, Zehua Wang, Yaobo Liang, Tao Ge, Chenfei Wu, Wang You, Ting Song, Yan Xia, Jonathan Tien, Nan Duan, Furu Wei

By introducing this framework, we aim to bridge the gap between humans and LLMs, enabling more effective and efficient utilization of LLMs for complex tasks.

Prompt Engineering

34,521

Paper
Code

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

2 code implementations • 13 Apr 2023 • Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan

Impressively, GPT-4 surpasses average human performance on SAT, LSAT, and math competitions, attaining a 95% accuracy rate on the SAT Math test and a 92. 5% accuracy on the English test of the Chinese national college entrance exam.

Decision Making Math

648

Paper
Code

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

no code implementations • 29 Mar 2023 • Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well.

Code Generation Common Sense Reasoning +1

Paper
Add Code

Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval

1 code implementation • 3 Feb 2023 • Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan

Specifically, we propose a multilingual PLM called masked sentence model (MSM), which consists of a sentence encoder to generate the sentence representations, and a document encoder applied to a sequence of sentence vectors from a document.

Relation Representation Learning +3

Paper
Code

Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval

1 code implementation • 7 Jun 2022 • Ning Wu, Yaobo Liang, Houxing Ren, Linjun Shou, Nan Duan, Ming Gong, Daxin Jiang

On the multilingual sentence retrieval task Tatoeba, our model achieves new SOTA results among methods without using bilingual data.

Language Modelling Passage Retrieval +4

Paper
Code

Multi-View Document Representation Learning for Open-Domain Dense Retrieval

no code implementations • ACL 2022 • Shunyu Zhang, Yaobo Liang, Ming Gong, Daxin Jiang, Nan Duan

Second, to prevent multi-view embeddings from collapsing to the same one, we further propose a global-local loss with annealed temperature to encourage the multiple viewers to better align with different potential queries.

Representation Learning Retrieval

Paper
Add Code

Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure

no code implementations • ACL 2022 • Yuan Chai, Yaobo Liang, Nan Duan

Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while the composition is more crucial to the success of cross-linguistic transfer.

Natural Language Inference Retrieval +2

Paper
Add Code

XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge

1 code implementation • 26 Sep 2021 • Xiaoze Jiang, Yaobo Liang, Weizhu Chen, Nan Duan

The results on MLQA and NER exhibit the superiority of XLM-K in knowledge related tasks.

Language Modelling NER

Paper
Code

Discovering Representation Sprachbund For Multilingual Pre-Training

no code implementations • Findings (EMNLP) 2021 • Yimin Fan, Yaobo Liang, Alexandre Muzio, Hany Hassan, Houqiang Li, Ming Zhou, Nan Duan

Then we cluster all the target languages into multiple groups and name each group as a representation sprachbund.

Multilingual NLP

Paper
Add Code

Simpson's Bias in NLP Training

no code implementations • 13 Mar 2021 • Fei Yuan, Longtu Zhang, Huang Bojun, Yaobo Liang

In most machine learning tasks, we evaluate a model $M$ on a given data population $S$ by measuring a population-level metric $F(S;M)$.

Multi-class Classification Sentence +1

Paper
Add Code

Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding

no code implementations • 16 Sep 2020 • Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen

The semi-structured answer has two advantages which are more readable and falsifiable compared to span answer.

Decoder Open Information Extraction +1

Paper
Add Code

GLOW : Global Weighted Self-Attention Network for Web Search

1 code implementation • 10 Jul 2020 • Xuan Shan, Chuanjie Liu, Yiqian Xia, Qi Chen, Yusi Zhang, Kaize Ding, Yaobo Liang, Angen Luo, Yuxiang Luo

Deep matching models aim to facilitate search engines retrieving more relevant documents by mapping queries and documents into semantic vectors in the first-stage retrieval.

Document Ranking Information Retrieval +2

Paper
Code

Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension

1 code implementation • ACL 2020 • Bo Zheng, Haoyang Wen, Yaobo Liang, Nan Duan, Wanxiang Che, Daxin Jiang, Ming Zhou, Ting Liu

Natural Questions is a new challenging machine reading comprehension benchmark with two-grained answers, which are a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).

Graph Attention Machine Reading Comprehension +1

Paper
Code

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

no code implementations • ACL 2020 • Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages.

Boundary Detection Machine Reading Comprehension +2

Paper
Add Code

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

2 code implementations • 3 Apr 2020 • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual tasks.

Natural Language Understanding XLM-R

Paper
Code

Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks

no code implementations • IJCNLP 2019 • Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou

On XNLI, 1. 8% averaged accuracy improvement (on 15 languages) is obtained.

Cross-Lingual Natural Language Inference Cross-Lingual Question Answering +1

Paper
Add Code

Dense Procedure Captioning in Narrated Instructional Videos

no code implementations • ACL 2019 • Botian Shi, Lei Ji, Yaobo Liang, Nan Duan, Peng Chen, Zhendong Niu, Ming Zhou

Understanding narrated instructional videos is important for both research and real-world web applications.

Dense Captioning

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.