Search Results for author: Jian-Guang Lou

Found 68 papers, 43 papers with code

Weakly Supervised Semantic Parsing by Learning from Mistakes

1 code implementation Findings (EMNLP) 2021 Jiaqi Guo, Jian-Guang Lou, Ting Liu, Dongmei Zhang

Using only 10% of utterance-denotation pairs, the parser achieves 84.2 denotation accuracy on WikiSQL, which is competitive with the previous state-of-the-art approaches using 100% labeled data.

Semantic Parsing

"What Do You Mean by That?" A Parser-Independent Interactive Approach for Enhancing Text-to-SQL

no code implementations EMNLP 2020 Yuntao Li, Bei Chen, Qian Liu, Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang

In Natural Language Interfaces to Databases systems, the text-to-SQL technique allows users to query databases by using natural language questions.

Text-To-SQL

GL-CLeF: A Global–Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding

1 code implementation ACL 2022 Libo Qin, Qiguang Chen, Tianbao Xie, Qixin Li, Jian-Guang Lou, Wanxiang Che, Min-Yen Kan

Specifically, we employ contrastive learning, leveraging bilingual dictionaries to construct multilingual views of the same utterance, then encourage their representations to be more similar than negative example pairs, which achieves to explicitly align representations of similar sentences across languages.

Contrastive Learning Cross-Lingual Transfer +2

Translating Headers of Tabular Data: A Pilot Study of Schema Translation

1 code implementation EMNLP 2021 Kunrui Zhu, Yan Gao, Jiaqi Guo, Jian-Guang Lou

Experiments on our dataset demonstrate that CAST significantly outperforms state-of-the-art neural machine translation models.

Machine Translation Translation

TWT: Table with Written Text for Controlled Data-to-Text Generation

no code implementations Findings (EMNLP) 2021 Tongliang Li, Lei Fang, Jian-Guang Lou, Zhoujun Li

In this paper, we propose to generate text conditioned on the structured data (table) and a prefix (the written text) by leveraging the pre-trained models.

Data-to-Text Generation

Make Your LLM Fully Utilize the Context

1 code implementation 25 Apr 2024 Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou

While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge.

4k Information Retrieval +1

Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

no code implementations 8 Mar 2024 Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng

The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task.

Benchmarking Decision Making +2

Data Transformation to Construct a Dataset for Generating Entity-Relationship Model from Natural Language

no code implementations 21 Dec 2023 Zhenwen Li, Jian-Guang Lou, Tao Xie

To address this issue, in this paper, we report our insight that there exists a high similarity between the task of NL2ERM and the increasingly popular task of text-to-SQL, and propose a data transformation algorithm that transforms the existing data of text-to-SQL into the data of NL2ERM.

Text-To-SQL

Thread of Thought Unraveling Chaotic Contexts

no code implementations 15 Nov 2023 Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen

Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation.

Reading Comprehension

LayoutPrompter: Awaken the Design Ability of Large Language Models

1 code implementation NeurIPS 2023 Jiawei Lin, Jiaqi Guo, Shizhao Sun, Zijiang James Yang, Jian-Guang Lou, Dongmei Zhang

In this work, we propose LayoutPrompter, which leverages large language models (LLMs) to address the above problems through in-context learning.

In-Context Learning

Learning From Mistakes Makes LLM Better Reasoner

1 code implementation 31 Oct 2023 Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen

To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process.

GSM8K Math +1

Re-Reading Improves Reasoning in Large Language Models

1 code implementation 12 Sep 2023 Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-Guang Lou

To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input.

A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions

no code implementations ICCV 2023 Jiawei Lin, Jiaqi Guo, Shizhao Sun, Weijiang Xu, Ting Liu, Jian-Guang Lou, Dongmei Zhang

To model combined and incomplete constraints, we use a Transformer-based layout generation model and carefully design a way to represent constraints and layouts as sequences.

Uncovering and Categorizing Social Biases in Text-to-SQL

1 code implementation 25 May 2023 Yan Liu, Yan Gao, Zhe Su, Xiaokang Chen, Elliott Ash, Jian-Guang Lou

In this work, we aim to uncover and categorize social biases in Text-to-SQL models.

Text-To-SQL

TACR: A Table-alignment-based Cell-selection and Reasoning Model for Hybrid Question-Answering

no code implementations 24 May 2023 Jian Wu, Yicheng Xu, Yan Gao, Jian-Guang Lou, Börje F. Karlsson, Manabu Okumura

A common challenge in HQA and other passage-table QA datasets is that it is generally unrealistic to iterate over all table rows, columns, and linked passages to retrieve evidence.

Question Answering Retrieval

Skill-Based Few-Shot Selection for In-Context Learning

no code implementations 23 May 2023 Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou

Few-shot selection -- selecting appropriate examples for each test instance separately -- is important for in-context learning.

In-Context Learning Semantic Parsing +1

Question Answering as Programming for Solving Time-Sensitive Questions

1 code implementation 23 May 2023 Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, Yujiu Yang

Question answering plays a pivotal role in human daily life because it involves our acquisition of knowledge about the world.

Natural Language Understanding Question Answering

How Do In-Context Examples Affect Compositional Generalization?

no code implementations 8 May 2023 Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou, Dongmei Zhang

Compositional generalization--understanding unseen combinations of seen primitives--is an essential reasoning capability in human intelligence.

In-Context Learning

RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation

1 code implementation 22 Mar 2023 Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen

The task of repository-level code completion is to continue writing the unfinished code based on a broader context of the repository.

Code Completion Language Modelling +1

LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models

1 code implementation ICCV 2023 Junyi Zhang, Jiaqi Guo, Shizhao Sun, Jian-Guang Lou, Dongmei Zhang

To tackle the challenge, we summarize three critical factors for achieving a mild forward process for the layout, i.e., legality, coordinate proximity and type disruption.

Layout Design

Does Deep Learning Learn to Abstract? A Systematic Probing Framework

1 code implementation 23 Feb 2023 Shengnan An, Zeqi Lin, Bei Chen, Qiang Fu, Nanning Zheng, Jian-Guang Lou

Abstraction is a desirable capability for deep learning models, which means to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context.

Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge

1 code implementation 3 Jan 2023 Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou

In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables.

Semantic Parsing Text-To-SQL

MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

1 code implementation 27 Dec 2022 Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou

Text-to-SQL semantic parsing is an important NLP task, which greatly facilitates the interaction between users and the database and becomes the key component in many human-computer interaction systems.

Benchmarking Semantic Parsing +1

Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation

1 code implementation ACL 2022 Xinyu Pi, Bing Wang, Yan Gao, Jiaqi Guo, Zhoujun Li, Jian-Guang Lou

The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications.

Text-To-SQL

Large Language Models Meet NL2Code: A Survey

no code implementations 19 Dec 2022 Daoguang Zan, Bei Chen, Fengji Zhang, Dianjie Lu, Bingchao Wu, Bei Guan, Yongji Wang, Jian-Guang Lou

The task of generating code from a natural language description, or NL2Code, is considered a pressing and significant challenge in code intelligence.

Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL

1 code implementation 17 Dec 2022 Bing Wang, Yan Gao, Zhoujun Li, Jian-Guang Lou

Following this study, we propose a simple yet effective counterfactual example generation approach that automatically produces ambiguous and unanswerable text-to-SQL examples.

counterfactual Text-To-SQL

When Language Model Meets Private Library

1 code implementation 31 Oct 2022 Daoguang Zan, Bei Chen, Zeqi Lin, Bei Guan, Yongji Wang, Jian-Guang Lou

In this paper, we investigate how to equip pre-trained language models with the ability of code generation for private libraries.

Code Generation Language Modelling +1

LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction

no code implementations CVPR 2023 Zhaoyun Jiang, Jiaqi Guo, Shizhao Sun, Huayu Deng, Zhongkai Wu, Vuksan Mijovic, Zijiang James Yang, Jian-Guang Lou, Dongmei Zhang

First, to flexibly handle diverse constraints, we propose a constraint serialization scheme, which represents different user constraints as sequences of tokens with a predefined format.

CodeT: Code Generation with Generated Tests

1 code implementation 21 Jul 2022 Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen

A natural way to evaluate the quality and correctness of a code solution is to run it against a set of test cases, but the manual creation of such test cases is often costly and time-consuming.

Ranked #1 on Code Generation on APPS (Introductory Pass@1 metric)

Code Generation

LogiGAN: Learning Logical Reasoning via Adversarial Pre-training

1 code implementation 18 May 2022 Xinyu Pi, Wanjun Zhong, Yan Gao, Nan Duan, Jian-Guang Lou

We present LogiGAN, an unsupervised adversarial pre-training framework for improving logical reasoning abilities of language models.

Logical Reasoning Sentence

Leveraging Reaction-aware Substructures for Retrosynthesis Analysis

1 code implementation 12 Apr 2022 Lei Fang, Junren Li, Ming Zhao, Li Tan, Jian-Guang Lou

In this paper, we propose a substructure-level decoding model, where the substructures are reaction-aware and can be automatically extracted with a fully data-driven approach.

Decision Making Machine Translation +2

UniSAr: A Unified Structure-Aware Autoregressive Language Model for Text-to-SQL

1 code implementation 15 Mar 2022 Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou

Existing text-to-SQL semantic parsers are typically designed for particular settings such as handling queries that span multiple tables, domains or turns which makes them ineffective when applied to different settings.

Language Modelling Text-To-SQL

Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models

no code implementations 7 Mar 2022 Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou

This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs.

Language Modelling Natural Language Understanding +1

Reasoning Like Program Executors

1 code implementation 27 Jan 2022 Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen

Reasoning over natural language is a long-standing goal for the research community.

Ranked #2 on Question Answering on DROP Test (using extra training data)

Logical Reasoning Math +1

LEMON: Language-Based Environment Manipulation via Execution-Guided Pre-training

2 code implementations 20 Jan 2022 Qi Shi, Qian Liu, Bei Chen, Yu Zhang, Ting Liu, Jian-Guang Lou

In this work, we propose LEMON, a general framework for language-based environment manipulation tasks.

Language Modelling

Part & Whole Extraction: Towards A Deep Understanding of Quantitative Facts for Percentages in Text

no code implementations 26 Oct 2021 Lei Fang, Jian-Guang Lou

", our goal is to obtain a deep understanding of the percentage numbers ("30 percent" and "20%") by extracting their quantitative facts: part ("like watching football" and "prefer to watch NBA") and whole ("Americans).

named-entity-recognition Named Entity Recognition +2

Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing

1 code implementation Findings (ACL) 2021 Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou, Jian-Guang Lou

Recent years pretrained language models (PLMs) hit a success on several downstream tasks, showing their power on modeling language.

Semantic Parsing Text-To-SQL

HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation

1 code implementation ACL 2022 Zhoujun Cheng, Haoyu Dong, Zhiruo Wang, Ran Jia, Jiaqi Guo, Yan Gao, Shi Han, Jian-Guang Lou, Dongmei Zhang

HiTab provides 10,686 QA pairs and descriptive sentences with well-annotated quantity and entity alignment on 3,597 tables with broad coverage of table hierarchies and numerical reasoning types.

Descriptive Entity Alignment +2

Chase: A Large-Scale and Pragmatic Chinese Dataset for Cross-Database Context-Dependent Text-to-SQL

no code implementations ACL 2021 Jiaqi Guo, Ziliang Si, Yu Wang, Qian Liu, Ming Fan, Jian-Guang Lou, Zijiang Yang, Ting Liu

However, we identify two biases in existing datasets for XDTS: (1) a high proportion of context-independent questions and (2) a high proportion of easy SQL queries.

Text-To-SQL

TAPEX: Table Pre-training via Learning a Neural SQL Executor

2 code implementations ICLR 2022 Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou

TAPEX addresses the data scarcity challenge via guiding the language model to mimic a SQL executor on the diverse, large-scale and high-quality synthetic corpus.

Ranked #1 on Semantic Parsing on WikiSQL (Denotation accuracy (test) metric)

Language Modelling Semantic Parsing +1

Iterative Utterance Segmentation for Neural Semantic Parsing

no code implementations 13 Dec 2020 Yinuo Guo, Zeqi Lin, Jian-Guang Lou, Dongmei Zhang

Experiments on Geo, ComplexWebQuestions, and Formulas show that our framework can consistently improve performances of neural semantic parsers in different domains.

Segmentation Semantic Parsing

Revisiting Iterative Back-Translation from the Perspective of Compositional Generalization

no code implementations 8 Dec 2020 Yinuo Guo, Hualei Zhu, Zeqi Lin, Bei Chen, Jian-Guang Lou, Dongmei Zhang

Human intelligence exhibits compositional generalization (i.e., the capacity to understand and produce unseen combinations of seen components), but current neural seq2seq models lack such ability.

Translation

"What Do You Mean by That?" A Parser-Independent Interactive Approach for Enhancing Text-to-SQL

1 code implementation 9 Nov 2020 Yuntao Li, Bei Chen, Qian Liu, Yan Gao, Jian-Guang Lou, Yan Zhang, Dongmei Zhang

In Natural Language Interfaces to Databases systems, the text-to-SQL technique allows users to query databases by using natural language questions.

Text-To-SQL

UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data

1 code implementation 15 Jul 2020 Qianhui Wu, Zijia Lin, Börje F. Karlsson, Biqing Huang, Jian-Guang Lou

Prior works in cross-lingual named entity recognition (NER) with no/little labeled data fall into two primary categories: model transfer based and data transfer based methods.

Cross-Lingual NER Knowledge Distillation +4

Compositional Generalization by Learning Analytical Expressions

1 code implementation NeurIPS 2020 Qian Liu, Shengnan An, Jian-Guang Lou, Bei Chen, Zeqi Lin, Yan Gao, Bin Zhou, Nanning Zheng, Dongmei Zhang

Compositional generalization is a basic and essential intellective capability of human beings, which allows us to recombine known parts readily.

Hierarchical Reinforcement Learning

Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language

1 code implementation ACL 2020 Qianhui Wu, Zijia Lin, Börje F. Karlsson, Jian-Guang Lou, Biqing Huang

However, such methods either are not applicable if the labeled data in the source languages is unavailable, or do not leverage information contained in unlabeled data in the target language.

Cross-Lingual NER named-entity-recognition +2

You Impress Me: Dialogue Generation via Mutual Persona Perception

1 code implementation ACL 2020 Qian Liu, Yihong Chen, Bei Chen, Jian-Guang Lou, Zixuan Chen, Bin Zhou, Dongmei Zhang

Despite the continuing efforts to improve the engagingness and consistency of chit-chat dialogue systems, the majority of current work simply focus on mimicking human-like responses, leaving understudied the aspects of modeling understanding between interlocutors.

Ranked #2 on Dialogue Generation on Persona-Chat (using extra training data)

Dialogue Generation

How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context

1 code implementation 3 Feb 2020 Qian Liu, Bei Chen, Jiaqi Guo, Jian-Guang Lou, Bin Zhou, Dongmei Zhang

Recently semantic parsing in context has received considerable attention, which is challenging since there are complex contextual phenomena.

Semantic Parsing

Data-Anonymous Encoding for Text-to-SQL Generation

no code implementations IJCNLP 2019 Zhen Dong, Shizhao Sun, Hongzhi Liu, Jian-Guang Lou, Dongmei Zhang

On text-to-SQL generation, the input utterance usually contains lots of tokens that are related to column names or cells in the table, called table-related tokens.

Text-To-SQL

A Hybrid Semantic Parsing Approach for Tabular Data Analysis

no code implementations 23 Oct 2019 Yan Gao, Jian-Guang Lou, Dongmei Zhang

This paper presents a novel approach to translating natural language questions to SQL queries for given tables, which meets three requirements as a real-world data analysis application: cross-domain, multilingualism and enabling quick-start.

Semantic Parsing

A Split-and-Recombine Approach for Follow-up Query Analysis

1 code implementation IJCNLP 2019 Qian Liu, Bei Chen, Haoyan Liu, Lei Fang, Jian-Guang Lou, Bin Zhou, Dongmei Zhang

To leverage the advances in context-independent semantic parsing, we propose to perform follow-up query analysis, aiming to restate context-dependent natural language queries with contextual information.

Natural Language Queries Semantic Parsing

LambdaOpt: Learn to Regularize Recommender Models in Finer Levels

1 code implementation 28 May 2019 Yihong Chen, Bei Chen, Xiangnan He, Chen Gao, Yong Li, Jian-Guang Lou, Yue Wang

We show how to employ LambdaOpt on matrix factorization, a classical model that is representative of a large family of recommender models.

Hyperparameter Optimization Recommendation Systems

FANDA: A Novel Approach to Perform Follow-up Query Analysis

1 code implementation 24 Jan 2019 Qian Liu, Bei Chen, Jian-Guang Lou, Ge Jin, Dongmei Zhang

NLIDB allow users to search databases using natural language instead of SQL-like query languages.

Learning-to-Ask: Knowledge Acquisition via 20 Questions

no code implementations 22 Jun 2018 Yihong Chen, Bei Chen, Xuguang Duan, Jian-Guang Lou, Yue Wang, Wenwu Zhu, Yong Cao

Almost all the knowledge empowered applications rely upon accurate knowledge, which has to be either collected manually with high cost, or extracted automatically with unignorable errors.
