Search Results for author: Caiming Xiong

Found 276 papers, 144 papers with code

Continual Learning via Explicit Structure Learning

no code implementations • ICLR 2019 • Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong

During structure learning, the model optimizes for the best structure for the current task.

Are Pre-trained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

no code implementations • NLP4ConvAI (ACL) 2022 • JianGuo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip Yu

Pre-trained Transformer-based models were reported to be robust in intent classification.

intent-classification Intent Classification +2

Few-Shot Intent Classification by Gauging Entailment Relationship Between Utterance and Semantic Label

no code implementations • EMNLP (NLP4ConvAI) 2021 • Jin Qu, Kazuma Hashimoto, Wenhao Liu, Caiming Xiong, Yingbo Zhou

Compared with DNNC, our proposed method is more efficient in both training and serving since it is based upon the entailment between query utterance and labels instead of all the training examples.

Classification intent-classification +2

The Thieves on Sesame Street are Polyglots - Extracting Multilingual Models from Monolingual APIs

no code implementations • EMNLP 2020 • Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher

Pre-training in natural language processing makes it easier for an adversary with only query access to a victim model to reconstruct a local copy of the victim by training with gibberish input data paired with the victim's labels for that data.

Simple Data Augmentation with the Mask Token Improves Domain Adaptation for Dialog Act Tagging

no code implementations • EMNLP 2020 • Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong

The concept of Dialogue Act (DA) is universal across different task-oriented dialogue domains - the act of "request" carries the same speaker intention whether it is for restaurant reservation or flight booking.

Data Augmentation Domain Generalization

[CASPI] Causal-aware Safe Policy Improvement for Task-oriented Dialogue

no code implementations • ACL 2022 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong

Furthermore, we demonstrate sample efficiency: our method, trained on only 20% of the data, is comparable to the current state-of-the-art method trained on 100% of the data on two out of three evaluation metrics.

Dialogue Management Management +1

BatchMixup: Improving Training by Interpolating Hidden States of the Entire Mini-batch

no code implementations • Findings (ACL) 2021 • Wenpeng Yin, Huan Wang, Jin Qu, Caiming Xiong

DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents

1 code implementation • COLING 2022 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, Ran Xu, Caiming Xiong

Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.

document understanding Language Modelling +1

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

no code implementations • 11 Apr 2024 • Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity.

Benchmarking

What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

1 code implementation • 3 Apr 2024 • Anthony Meng Huat Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven C. H. Hoi, Caiming Xiong

Vision-language (VL) models, pretrained on colossal image-text datasets, have attained broad VL competence that is difficult to evaluate.

Transfer Learning

How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library

1 code implementation • 31 Mar 2024 • Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, Shafiq Joty

With the rise of Large Language Models (LLMs) in recent years, new opportunities are emerging, but also new challenges, and contamination is quickly becoming critical.

Question Answering

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

1 code implementation • 28 Feb 2024 • Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong

This paper presents FoFo, a pioneering benchmark for evaluating large language models' (LLMs) ability to follow complex, domain-specific formats, a crucial yet underexamined capability for their application as AI agents.

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

2 code implementations • 23 Feb 2024 • JianGuo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training.

AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System

1 code implementation • 23 Feb 2024 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese

Thus, we open-source a new AI agent library, AgentLite, which simplifies this process by offering a lightweight, user-friendly platform for innovating LLM agent reasoning, architectures, and applications with ease.

Unified Training of Universal Time Series Forecasting Transformers

1 code implementation • 4 Feb 2024 • Gerald Woo, Chenghao Liu, Akshat Kumar, Caiming Xiong, Silvio Savarese, Doyen Sahoo

Deep learning for time series forecasting has traditionally operated within a one-model-per-dataset framework, limiting its potential to leverage the game-changing impact of large pre-trained models.

Time Series Time Series Forecasting

Causal Layering via Conditional Entropy

no code implementations • 19 Jan 2024 • Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

Under appropriate assumptions and conditioning, we can separate the sources or sinks from the remainder of the nodes by comparing their conditional entropy to the unconditional entropy of their noise.

Causal Discovery

Editing Arbitrary Propositions in LLMs without Subject Labels

no code implementations • 15 Jan 2024 • Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese

On datasets of binary propositions derived from the CounterFact dataset, we show that our method -- without access to subject labels -- performs close to state-of-the-art L&E methods that have access to subject labels.

Language Modelling Large Language Model +1

Parameter-Efficient Detoxification with Contrastive Decoding

no code implementations • 13 Jan 2024 • Tong Niu, Caiming Xiong, Semih Yavuz, Yingbo Zhou

DETOXIGEN is an ensemble of a pre-trained language model (generator) and a detoxifier.

Attribute Language Modelling +1

TrustLLM: Trustworthiness in Large Language Models

1 code implementation • 10 Jan 2024 • Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao

This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.

Ethics Fairness

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

1 code implementation • 3 Jan 2024 • David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo

This work presents Moonshot, a new video generation model that conditions simultaneously on multimodal inputs of image and text.

Image Animation Video Editing +1

Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding

no code implementations • 11 Dec 2023 • Lifu Tu, Semih Yavuz, Jin Qu, Jiacheng Xu, Rui Meng, Caiming Xiong, Yingbo Zhou

Large Language Models (LLMs) have demonstrated a powerful ability for text generation.

Question Answering Text Generation

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning

1 code implementation • 30 Nov 2023 • Artemis Panagopoulou, Le Xue, Ning Yu, Junnan Li, Dongxu Li, Shafiq Joty, Ran Xu, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles

Vision-language pre-training and instruction tuning have demonstrated general-purpose capabilities in 2D visual reasoning tasks by aligning visual encoders with state-of-the-art large language models (LLMs).

Visual Reasoning

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

1 code implementation • 28 Nov 2023 • Hailin Chen, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut, Ruochen Zhao, Caiming Xiong, Shafiq Joty

Upon its release in late 2022, ChatGPT brought a seismic shift in the entire landscape of AI, both in research and commerce.

Language Modelling Large Language Model

Diffusion Model Alignment Using Direct Preference Optimization

no code implementations • 21 Nov 2023 • Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences.

Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries

no code implementations • 15 Nov 2023 • Prafulla Kumar Choubey, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu

Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote.

Fair Abstractive Summarization of Diverse Perspectives

1 code implementation • 14 Nov 2023 • Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang

However, current work in summarization metrics and Large Language Models (LLMs) evaluation has not explored fair abstractive summarization.

Abstractive Text Summarization Fairness

Are You Sure? Challenging LLMs Leads to Performance Drops in The FlipFlop Experiment

no code implementations • 14 Nov 2023 • Philippe Laban, Lidiya Murakhovs'ka, Caiming Xiong, Chien-Sheng Wu

The interactive nature of Large Language Models (LLMs) theoretically allows models to refine and improve their answers, yet systematic analysis of the multi-turn behavior of LLMs remains limited.

Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems

1 code implementation • 26 Oct 2023 • Lidiya Murakhovs'ka, Philippe Laban, Tian Xie, Caiming Xiong, Chien-Sheng Wu

Making big purchases requires consumers to research or consult a salesperson to gain domain expertise.

Informativeness Recommendation Systems

OpenAgents: An Open Platform for Language Agents in the Wild

2 code implementations • 16 Oct 2023 • Tianbao Xie, Fan Zhou, Zhoujun Cheng, Peng Shi, Luoxuan Weng, Yitao Liu, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu

Language agents show potential in being capable of utilizing natural language for varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs).

2D Object Detection

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

no code implementations • 16 Oct 2023 • Tianyu Guo, Wei Hu, Song Mei, Huan Wang, Caiming Xiong, Silvio Savarese, Yu Bai

Through extensive probing and a new pasting experiment, we further reveal several mechanisms within the trained transformers, such as concrete copying behaviors on both the inputs and the representations, linear ICL capability of the upper layers alone, and a post-ICL representation selection mechanism in a harder mixture setting.

In-Context Learning

Lemur: Harmonizing Natural Language and Code for Language Agents

1 code implementation • 10 Oct 2023 • Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

no code implementations • 29 Sep 2023 • Ansong Ni, Pengcheng Yin, Yilun Zhao, Martin Riddell, Troy Feng, Rui Shen, Stephen Yin, Ye Liu, Semih Yavuz, Caiming Xiong, Shafiq Joty, Yingbo Zhou, Dragomir Radev, Arman Cohan

Recently, large language models (LLMs), especially those that are pretrained on code, have demonstrated strong capabilities in generating programs from natural language inputs in a few-shot or even zero-shot manner.

Code Generation Math +1

Beyond the Chat: Executable and Verifiable Text-Editing with LLMs

no code implementations • 27 Sep 2023 • Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu

Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.

Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

1 code implementation • 17 Sep 2023 • Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event.

Document Summarization Language Modelling +3

XGen-7B Technical Report

1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong

Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.

2k 8k

Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs

no code implementations • 24 Aug 2023 • Ye Liu, Semih Yavuz, Rui Meng, Meghana Moorthy, Shafiq Joty, Caiming Xiong, Yingbo Zhou

This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation.

Answer Generation Open-Domain Question Answering +1

Enhancing Performance on Seen and Unseen Dialogue Scenarios using Retrieval-Augmented End-to-End Task-Oriented System

no code implementations • 16 Aug 2023 • JianGuo Zhang, Stephen Roller, Kun Qian, Zhiwei Liu, Rui Meng, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong

End-to-end task-oriented dialogue (TOD) systems have achieved promising performance by leveraging sophisticated natural language understanding and natural language generation capabilities of pre-trained models.

Natural Language Understanding Retrieval +1

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

2 code implementations • 11 Aug 2023 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).

Benchmarking Decision Making

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

no code implementations • 4 Aug 2023 • Weiran Yao, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, JianGuo Zhang, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

This demonstrates that using policy gradient optimization to improve language agents, for which we believe our work is one of the first, seems promising and can be applied to optimize other models in the agent architecture to enhance agent performances over time.

Language Modelling

DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI

1 code implementation • 19 Jul 2023 • JianGuo Zhang, Kun Qian, Zhiwei Liu, Shelby Heinecke, Rui Meng, Ye Liu, Zhou Yu, Huan Wang, Silvio Savarese, Caiming Xiong

Despite advancements in conversational AI, language models encounter challenges in handling diverse conversational tasks, and existing dialogue dataset collections often lack diversity and comprehensiveness.

Few-Shot Learning Language Modelling +1

REX: Rapid Exploration and eXploitation for AI Agents

no code implementations • 18 Jul 2023 • Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese

In this paper, we propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.

Decision Making Reinforcement Learning (RL)

Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

no code implementations • 6 Jul 2023 • Jiacheng Guo, Minshuo Chen, Huan Wang, Caiming Xiong, Mengdi Wang, Yu Bai

This paper studies the sample-efficiency of learning in Partially Observable Markov Decision Processes (POMDPs), a challenging problem in reinforcement learning that is known to be exponentially hard in the worst-case.

Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning

1 code implementation • 1 Jun 2023 • Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Jason Wu

Large language models (LLMs) have shown impressive performance in following natural language instructions to solve unseen tasks.

SWiPE: A Dataset for Document-Level Simplification of Wikipedia Pages

1 code implementation • 30 May 2023 • Philippe Laban, Jesse Vig, Wojciech Kryscinski, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

Text simplification research has mostly focused on sentence-level simplification, even though many desirable edits - such as adding relevant background information or reordering content - may require document-level context.

Sentence Text Simplification

LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

1 code implementation • 23 May 2023 • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu

To address this, we propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.

Misinformation

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

1 code implementation • NeurIPS 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, Ran Xu

Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.

Image Generation

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.

Ranked #6 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)

3D Point Cloud Classification Representation Learning +1

HPE: Answering Complex Questions over Text by Hybrid Question Parsing and Execution

no code implementations • 12 May 2023 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question.

Knowledge Graphs Question Answering +1

Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training

no code implementations • 12 May 2023 • Ziwei Fan, Zhiwei Liu, Shelby Heinecke, JianGuo Zhang, Huan Wang, Caiming Xiong, Philip S. Yu

This paper presents a novel paradigm for the Zero-Shot Item-based Recommendation (ZSIR) task, which pre-trains a model on a product knowledge graph (PKG) to refine the item features from PLMs.

Recommendation Systems

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages

2 code implementations • 3 May 2023 • Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou

In this study, we attempt to render the training of LLMs for program synthesis more efficient by unifying four key components: (1) model architectures, (2) learning methods, (3) infill sampling, and, (4) data distributions.

Causal Language Modeling Language Modelling +2

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning

1 code implementation • 3 Apr 2023 • Lifu Tu, Jin Qu, Semih Yavuz, Shafiq Joty, Wenhao Liu, Caiming Xiong, Yingbo Zhou

Our results demonstrate the strong and efficient modeling ability of NLI-based classifiers and the large cross-lingual transfer improvements achieved by our aligned prompts, particularly in few-shot settings.

Cross-Lingual Transfer intent-classification +4

GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation

1 code implementation • ICCV 2023 • Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, Ran Xu

Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.

Image Generation

HIVE: Harnessing Human Feedback for Instructional Visual Editing

1 code implementation • 16 Mar 2023 • Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, Ran Xu

Incorporating human feedback has been shown to be crucial to align text generated by large language models to human preferences.

Text-based Image Editing

On the Unlikelihood of D-Separation

no code implementations • 10 Mar 2023 • Itai Feigenbaum, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Devansh Arpit

We then provide an analytic average case analysis of the PC Algorithm for causal discovery, as well as a variant of the SGS Algorithm we call UniformSGS.

Causal Discovery

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

1 code implementation • 7 Mar 2023 • Yixin Liu, Alexander R. Fabbri, Yilun Zhao, PengFei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics.

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems

2 code implementations • 20 Feb 2023 • Yihao Feng, Shentao Yang, Shujian Zhang, JianGuo Zhang, Caiming Xiong, Mingyuan Zhou, Huan Wang

Prior works mainly focus on adopting advanced RL techniques to train the ToD agents, while the design of the reward function is not well studied.

Learning-To-Rank Reinforcement Learning (RL) +2

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

no code implementations • 18 Feb 2023 • Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun

This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.

Graph Learning Language Modelling +1

Designing and Evaluating Interfaces that Highlight News Coverage Diversity Using Discord Questions

no code implementations • 17 Feb 2023 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong

In a second usability study, we developed and implemented a reading exercise with 95 novice news readers to measure exposure to coverage diversity.

Improved Online Conformal Prediction via Strongly Adaptive Online Learning

2 code implementations • 15 Feb 2023 • Aadyot Bhatnagar, Huan Wang, Caiming Xiong, Yu Bai

We prove that our methods achieve near-optimal strongly adaptive regret for all interval lengths simultaneously, and approximately valid coverage.

Conformal Prediction Image Classification +4

Lower Bounds for Learning in Revealing POMDPs

no code implementations • 2 Feb 2023 • Fan Chen, Huan Wang, Caiming Xiong, Song Mei, Yu Bai

However, the fundamental limits for learning in revealing POMDPs are much less understood, with existing lower bounds being rather preliminary and having substantial gaps from the current best upper bounds.

Reinforcement Learning (RL)

Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data

1 code implementation • 25 Jan 2023 • Devansh Arpit, Matthew Fernandez, Itai Feigenbaum, Weiran Yao, Chenghao Liu, Wenzhuo Yang, Paul Josel, Shelby Heinecke, Eric Hu, Huan Wang, Stephen Hoi, Caiming Xiong, Kun Zhang, Juan Carlos Niebles

Finally, we provide a user interface (UI) that allows users to perform causal analysis on data without coding.

Causal Discovery Causal Inference +2

Model-Agnostic Hierarchical Attention for 3D Object Detection

no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Juan Carlos Niebles, Caiming Xiong, Ran Xu

By plugging our proposed modules into the state-of-the-art transformer-based 3D detector, we improve the previous best results on both benchmarks, with the largest improvement margin on small objects.

3D Object Detection Object +1

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

1 code implementation • 19 Dec 2022 • Ning Yu, Chia-Chih Chen, Zeyuan Chen, Rui Meng, Gang Wu, Paul Josel, Juan Carlos Niebles, Caiming Xiong, Ran Xu

Graphic layout designs play an essential role in visual communication.

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

2 code implementations • 15 Dec 2022 • Yixin Liu, Alexander R. Fabbri, PengFei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests.

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding

1 code implementation • CVPR 2023 • Le Xue, Mingfei Gao, Chen Xing, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese

Then, ULIP learns a 3D representation space aligned with the common image-text space, using a small number of automatically synthesized triplets.

Ranked #3 on Training-free 3D Point Cloud Classification on ModelNet40 (using extra training data)

3D Architecture 3D Classification +5

Best-$k$ Search Algorithm for Neural Text Generation

no code implementations • 22 Nov 2022 • Jiacheng Xu, Caiming Xiong, Silvio Savarese, Yingbo Zhou

We first investigate the vanilla best-first search (BFS) algorithm and then propose the Best-$k$ Search algorithm.

Question Generation Question-Generation +2

SPE: Symmetrical Prompt Enhancement for Fact Probing

no code implementations • 14 Nov 2022 • Yiyuan Li, Tong Che, Yezhen Wang, Zhengbao Jiang, Caiming Xiong, Snigdha Chaturvedi

In this work, we propose Symmetrical Prompt Enhancement (SPE), a continuous prompt-based method for factual probing in PLMs that leverages the symmetry of the task by constructing symmetrical prompts for subject and object prediction.

Object

Improving Factual Consistency in Summarization with Compression-Based Post-Editing

1 code implementation • 11 Nov 2022 • Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong

We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed.

Informativeness Sentence +1

Paper
Code

Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database

no code implementations • 9 Nov 2022 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB).

Question Answering Semantic Parsing

Paper
Add Code

Discord Questions: A Computational Approach To Diversity Analysis in News Coverage

1 code implementation • 9 Nov 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong

There are many potential benefits to news readers accessing diverse sources.

Question Generation Question-Generation

Paper
Code

Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning

no code implementations • 23 Oct 2022 • Xiangyu Peng, Chen Xing, Prafulla Kumar Choubey, Chien-Sheng Wu, Caiming Xiong

In this way, SESoM inherits the superior generalization of model ensemble approaches and simultaneously captures the sample-specific competence of each source prompt.

Transfer Learning

Paper
Add Code

Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models

2 code implementations • 22 Oct 2022 • Lifu Tu, Caiming Xiong, Yingbo Zhou

Pre-trained multilingual language models show significant performance gains for zero-shot cross-lingual model transfer on a wide range of natural language understanding (NLU) tasks.

Cross-Lingual Transfer Natural Language Understanding +3

Paper
Code

Binding Language Models in Symbolic Languages

1 code implementation • 6 Oct 2022 • Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

We propose Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e.g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations.

Ranked #4 on Table-based Fact Verification on TabFact

Language Modelling Semantic Parsing +1

Paper
Code

FOLIO: Natural Language Reasoning with First-Order Logic

1 code implementation • 2 Sep 2022 • Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Luke Benson, Lucy Sun, Ekaterina Zubova, Yujie Qiao, Matthew Burtell, David Peng, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Shafiq Joty, Alexander R. Fabbri, Wojciech Kryscinski, Xi Victoria Lin, Caiming Xiong, Dragomir Radev

We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first order logic (FOL) annotations.

Language Modelling Large Language Model +1

Paper
Code

Generating Negative Samples for Sequential Recommendation

no code implementations • 7 Aug 2022 • Yongjun Chen, Jia Li, Zhiwei Liu, Nitish Shirish Keskar, Huan Wang, Julian McAuley, Caiming Xiong

Due to the dynamics of users' interests and model updates during training, considering randomly sampled items from a user's non-interacted item set as negatives can be uninformative.

Sequential Recommendation

Paper
Add Code

BigIssue: A Realistic Bug Localization Benchmark

no code implementations • 21 Jul 2022 • Paul Kassianik, Erik Nijkamp, Bo Pang, Yingbo Zhou, Caiming Xiong

As machine learning tools progress, the inevitable question arises: How can machine learning help us write better code?

BIG-bench Machine Learning Program Repair

Paper
Add Code

Policy Optimization for Markov Games: Unified Framework and Faster Convergence

no code implementations • 6 Jun 2022 • Runyu Zhang, Qinghua Liu, Huan Wang, Caiming Xiong, Na Li, Yu Bai

Next, we show that this framework instantiated with the Optimistic Follow-The-Regularized-Leader (OFTRL) algorithm at each state (and smooth value updates) can find an $\mathcal{\widetilde{O}}(T^{-5/6})$ approximate NE in $T$ iterations, and a similar algorithm with slightly modified value update rule achieves a faster $\mathcal{\widetilde{O}}(T^{-1})$ convergence rate.

Multi-agent Reinforcement Learning

Paper
Add Code

MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation

1 code implementation • 31 May 2022 • Wenzhuo Yang, Jia Li, Caiming Xiong, Steven C. H. Hoi

Counterfactual explanation is an important Explainable AI technique to explain machine learning predictions.

BIG-bench Machine Learning counterfactual +1

Paper
Code

Modeling Multi-hop Question Answering as Single Sequence Prediction

no code implementations • ACL 2022 • Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong

Fusion-in-decoder (FiD) (Izacard and Grave, 2020) is a generative question answering (QA) model that leverages passage retrieval with a pre-trained transformer and pushed the state of the art on single-hop QA.

Answer Generation Generative Question Answering +3

Paper
Add Code

OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval

no code implementations • Findings (ACL) 2022 • Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong

When finetuned on a single rich-resource language pair, be it English-centered or not, our model is able to match the performance of the ones finetuned on all language pairs under the same data budget with less than 2.0 points decrease in accuracy.

Machine Translation Retrieval +2

Paper
Add Code

Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets

1 code implementation • 13 May 2022 • Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

Precisely assessing the progress in natural language generation (NLG) tasks is challenging, and human evaluation to establish a preference in a model's output over another is often necessary.

nlg evaluation Question Answering +3

Paper
Code

Quiz Design Task: Helping Teachers Create Quizzes with Automated Question Generation

no code implementations • Findings (NAACL) 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Wenhao Liu, Caiming Xiong

Question generation (QGen) models are often evaluated with standardized NLG metrics that are based on n-gram overlap.

Question Generation Question-Generation +1

Paper
Add Code

Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework

1 code implementation • CVPR 2022 • Shu Zhang, Ran Xu, Caiming Xiong, Chetan Ramaiah

Current contrastive learning frameworks focus on leveraging a single supervisory signal to learn representations, which limits the efficacy on unseen data and downstream tasks.

Contrastive Learning Representation Learning

Paper
Code

A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis

1 code implementation • Findings (NAACL) 2022 • Ehsan Hosseini-Asl, Wenhao Liu, Caiming Xiong

Our evaluation results on the single-task polarity prediction show that our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in few-shot and full-shot settings.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +4

Paper
Code

ELECRec: Training Sequential Recommenders as Discriminators

1 code implementation • 5 Apr 2022 • Yongjun Chen, Jia Li, Caiming Xiong

A generator, as an auxiliary model, is trained jointly with the discriminator to sample plausible alternative next items and will be thrown out after training.

Sequential Recommendation

Paper
Code

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

5 code implementations • 25 Mar 2022 • Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Ranked #79 on Code Generation on HumanEval

Code Generation Language Modelling +2

Paper
Code

Improving Contrastive Learning with Model Augmentation

1 code implementation • 25 Mar 2022 • Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong

However, existing methods all construct views by adopting augmentation from data perspectives, while we argue that 1) optimal data augmentation methods are hard to devise, 2) data augmentation methods destroy sequential correlations, and 3) data augmentation fails to incorporate comprehensive self-supervised signals.

Contrastive Learning Data Augmentation +2

Paper
Code

Converse: A Tree-Based Modular Task-Oriented Dialogue System

1 code implementation • 23 Mar 2022 • Tian Xie, Xinyi Yang, Angela S. Lin, Feihong Wu, Kazuma Hashimoto, Jin Qu, Young Mo Kang, Wenpeng Yin, Huan Wang, Semih Yavuz, Gang Wu, Michael Jones, Richard Socher, Yingbo Zhou, Wenhao Liu, Caiming Xiong

At the core of the struggle is the need to script every single turn of interactions between the bot and the human user.

Dialogue Management Management +1

Paper
Code

ConTinTin: Continual Learning from Task Instructions

no code implementations • ACL 2022 • Wenpeng Yin, Jia Li, Caiming Xiong

This work defines a new learning paradigm, ConTinTin (Continual Learning from Task Instructions), in which a system should learn a sequence of new tasks one by one, with each task explained by a piece of textual instruction.

Continual Learning

Paper
Add Code

Long Document Summarization with Top-down and Bottom-up Inference

no code implementations • 15 Mar 2022 • Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong

Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.

Ranked #1 on Text Summarization on Pubmed

Paper
Add Code

Structure Extraction in Task-Oriented Dialogues with Slot Clustering

2 code implementations • 28 Feb 2022 • Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

Extracting structure information from dialogue data can help us better understand user and system behaviors.

Clustering Data Augmentation +1

Paper
Code

Efficient and Differentiable Conformal Prediction with General Function Classes

1 code implementation • ICLR 2022 • Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong

Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.

Conformal Prediction Image Classification +2

Paper
Code

Intent Contrastive Learning for Sequential Recommendation

1 code implementation • 5 Feb 2022 • Yongjun Chen, Zhiwei Liu, Jia Li, Julian McAuley, Caiming Xiong

Specifically, we introduce a latent variable to represent users' intents and learn the distribution function of the latent variable via clustering.

Contrastive Learning Model Optimization +3

Paper
Code

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

6 code implementations • 28 Jan 2022 • Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi

Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.

Ranked #3 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)

Image Captioning Image-text matching +5

Paper
Code

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.

Ranked #1 on Task-Oriented Dialogue Systems on KVRET

Few-Shot Learning Question Answering +3

Paper
Code

RGRecSys: A Toolkit for Robustness Evaluation of Recommender Systems

1 code implementation • 12 Jan 2022 • Zohreh Ovaisi, Shelby Heinecke, Jia Li, Yongfeng Zhang, Elena Zheleva, Caiming Xiong

Robust machine learning is an increasingly important topic that focuses on developing models resilient to various forms of imperfect data.

Recommendation Systems

Paper
Code

QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

1 code implementation • NAACL 2022 • Alexander R. Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

Factual consistency is an essential quality of text summarization models in practical settings.

Question Answering Question Generation +2

Paper
Code

Value Retrieval with Arbitrary Queries for Form-like Documents

1 code implementation • 15 Dec 2021 • Mingfei Gao, Le Xue, Chetan Ramaiah, Chen Xing, Ran Xu, Caiming Xiong

Unlike previous methods that only address a fixed set of field items, our method predicts target value for an arbitrary query based on the understanding of the layout and semantics of a form.

document understanding Language Modelling +1

Paper
Code

Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution

no code implementations • 20 Nov 2021 • Wenpeng Yin, Shelby Heinecke, Jia Li, Nitish Shirish Keskar, Michael Jones, Shouzhong Shi, Stanislav Georgiev, Kurt Milich, Joseph Esposito, Caiming Xiong

The distribution gap between training datasets and data encountered in production is well acknowledged.

Entity Resolution

Paper
Add Code

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels

1 code implementation • 18 Nov 2021 • Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong

To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs.

Object object-detection +1

Paper
Code

Dense Hierarchical Retrieval for Open-Domain Question Answering

1 code implementation • Findings (EMNLP) 2021 • Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu

In this work, we propose Dense Hierarchical Retrieval (DHR), a hierarchical framework that can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.

Open-Domain Question Answering Retrieval +1

Paper
Code

Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization

1 code implementation • 21 Oct 2021 • Devansh Arpit, Huan Wang, Yingbo Zhou, Caiming Xiong

We first show that this chaotic behavior exists even along the training optimization trajectory of a single model, and propose a simple model averaging protocol that both significantly boosts domain generalization and diminishes the impact of stochasticity by improving the rank correlation between the in-domain validation accuracy and out-domain test accuracy, which is crucial for reliable early stopping.

Ranked #4 on Domain Generalization on TerraIncognita

Domain Generalization Model Selection

Paper
Code

Learning Rich Nearest Neighbor Representations from Self-supervised Ensembles

no code implementations • 19 Oct 2021 • Bram Wallace, Devansh Arpit, Huan Wang, Caiming Xiong

Pretraining convolutional neural networks via self-supervision, and applying them in transfer learning, is an incredibly fast-growing field that is rapidly and iteratively improving performance across practically all image domains.

Transfer Learning

Paper
Add Code

Improving Tail-Class Representation with Centroid Contrastive Learning

no code implementations • 19 Oct 2021 • Anthony Meng Huat Tiong, Junnan Li, Guosheng Lin, Boyang Li, Caiming Xiong, Steven C. H. Hoi

ICCL interpolates two images from a class-agnostic sampler and a class-aware sampler, and trains the model such that the representation of the interpolative image can be used to retrieve the centroids for both source classes.

Ranked #22 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Contrastive Learning Image Classification +2

Paper
Add Code

Momentum Contrastive Autoencoder: Using Contrastive Learning for Latent Space Distribution Matching in WAE

no code implementations • 19 Oct 2021 • Devansh Arpit, Aadyot Bhatnagar, Huan Wang, Caiming Xiong

Wasserstein autoencoder (WAE) shows that matching two distributions is equivalent to minimizing a simple autoencoder (AE) loss under the constraint that the latent space of this AE matches a pre-specified prior distribution.

Contrastive Learning Representation Learning

Paper
Add Code

MixQG: Neural Question Generation with Mixed Answer Types

1 code implementation • Findings (NAACL) 2022 • Lidiya Murakhovs'ka, Chien-Sheng Wu, Philippe Laban, Tong Niu, Wenhao Liu, Caiming Xiong

Asking good questions is an essential ability for both human and machine intelligence.

Multiple-choice Question Answering +2

Paper
Code

DialFact: A Benchmark for Fact-Checking in Dialogue

2 code implementations • ACL 2022 • Prakhar Gupta, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

Fact-checking is an essential tool to mitigate the spread of misinformation and disinformation.

Claim Verification Fact Checking +2

Paper
Code

Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting

no code implementations • 11 Oct 2021 • Zahra Fatemi, Chen Xing, Wenhao Liu, Caiming Xiong

In this work, we empirically show that catastrophic forgetting occurs in such methods by evaluating them with general NLP tasks in GLUE.

coreference-resolution Fairness

Paper
Add Code

Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks

1 code implementation • 8 Oct 2021 • Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong, Ran Xu

We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks.

Optical Character Recognition (OCR)

Paper
Code

Field Extraction from Forms with Unlabeled Data

2 code implementations • SpaNLP (ACL) 2022 • Mingfei Gao, Zeyuan Chen, Nikhil Naik, Kazuma Hashimoto, Caiming Xiong, Ran Xu

We propose a novel framework to conduct field extraction from forms with unlabeled data.

Pseudo Label

Paper
Code

Self-supervised Learning for Sequential Recommendation with Model Augmentation

no code implementations • 29 Sep 2021 • Zhiwei Liu, Yongjun Chen, Jia Li, Man Luo, Philip S. Yu, Caiming Xiong

Contrastive Learning Data Augmentation +2

Paper
Add Code

Long Document Summarization with Top-Down and Bottom-Up Representation Inference

no code implementations • 29 Sep 2021 • Bo Pang, Erik Nijkamp, Wojciech Maciej Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong

Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents.

Document Summarization

Paper
Add Code

Modeling Dynamic Attributes for Next Basket Recommendation

no code implementations • 23 Sep 2021 • Yongjun Chen, Jia Li, Chenghao Liu, Chenxi Li, Markus Anderle, Julian McAuley, Caiming Xiong

However, properly integrating them into user interest models is challenging since attribute dynamics can be diverse such as time-interval aware, periodic patterns (etc.

Attribute Next-basket recommendation

Paper
Add Code

Merlion: A Machine Learning Library for Time Series

2 code implementations • 20 Sep 2021 • Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang

We introduce Merlion, an open-source machine learning library for time series.

Anomaly Detection BIG-bench Machine Learning +2

Paper
Code

RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering

1 code implementation • ACL 2022 • Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong

We present RnG-KBQA, a Rank-and-Generate approach for KBQA, which remedies the coverage issue with a generation model while preserving a strong generalization capability.

Entity Linking Knowledge Base Question Answering +1

Paper
Code

Contrastive Self-supervised Sequential Recommendation with Robust Augmentation

1 code implementation • 14 Aug 2021 • Zhiwei Liu, Yongjun Chen, Jia Li, Philip S. Yu, Julian McAuley, Caiming Xiong

In this paper, we investigate the application of contrastive Self-Supervised Learning (SSL) to the sequential recommendation, as a way to alleviate some of these issues.

Contrastive Learning Self-Supervised Learning +1

Paper
Code

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

5 code implementations • NeurIPS 2021 • Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi

Most existing methods employ a transformer-based multimodal encoder to jointly model visual tokens (region-based image features) and word tokens.

Ranked #5 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)

Grounded language learning Image-text matching +8

Paper
Code

A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

no code implementations • NeurIPS 2021 • Pan Zhou, Caiming Xiong, Xiao-Tong Yuan, Steven Hoi

Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query.

Contrastive Learning Representation Learning +2

Paper
Add Code

DocNLI: A Large-scale Dataset for Document-level Natural Language Inference

1 code implementation • Findings (ACL) 2021 • Wenpeng Yin, Dragomir Radev, Caiming Xiong

It has been studied intensively in the past few years thanks to the availability of large-scale labeled datasets.

Natural Language Inference Question Answering +2

Paper
Code

Learning to Play General-Sum Games Against Multiple Boundedly Rational Agents

1 code implementation • 10 Jun 2021 • Eric Zhao, Alexander R. Trott, Caiming Xiong, Stephan Zheng

We study the problem of training a principal in a multi-agent general-sum game using reinforcement learning (RL).

Decision Making Multi-agent Reinforcement Learning +1

Paper
Code

Understanding the Under-Coverage Bias in Uncertainty Estimation

no code implementations • NeurIPS 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong

Estimating the data uncertainty in regression tasks is often done by learning a quantile function or a prediction interval of the true label conditioned on the input.

regression

Paper
Add Code

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

no code implementations • NeurIPS 2021 • Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai

This offline result is the first that matches the sample complexity lower bound in this setting, and resolves a recent open question in offline RL.

Offline RL Open-Ended Question Answering +2

Paper
Add Code

Are Pretrained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

1 code implementation • 8 Jun 2021 • Jianguo Zhang, Kazuma Hashimoto, Yao Wan, Zhiwei Liu, Ye Liu, Caiming Xiong, Philip S. Yu

Pre-trained Transformer-based models were reported to be robust in intent classification.

intent-classification Intent Classification +2

Paper
Code

Evaluating State-of-the-Art Classification Models Against Bayes Optimality

1 code implementation • NeurIPS 2021 • Ryan Theisen, Huan Wang, Lav R. Varshney, Caiming Xiong, Richard Socher

Moreover, we show that by varying the temperature of the learned flow models, we can generate synthetic datasets that closely resemble standard benchmark datasets, but with almost any desired Bayes error.

Paper
Code

Unsupervised Out-of-Domain Detection via Pre-trained Transformers

1 code implementation • ACL 2021 • Keyang Xu, Tongzheng Ren, Shikun Zhang, Yihao Feng, Caiming Xiong

Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs.

Paper
Code

SCRIPT: Self-Critic PreTraining of Transformers

no code implementations • NAACL 2021 • Erik Nijkamp, Bo Pang, Ying Nian Wu, Caiming Xiong

We introduce Self-CRItic Pretraining Transformers (SCRIPT) for representation learning of text.

Language Modelling Masked Language Modeling +1

Paper
Add Code

Controllable Abstractive Dialogue Summarization with Sketch Supervision

1 code implementation • Findings (ACL) 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong

In this paper, we aim to improve abstractive dialogue summarization quality and, at the same time, enable granularity control.

Abstractive Dialogue Summarization

Paper
Code

BookSum: A Collection of Datasets for Long-form Narrative Summarization

2 code implementations • 18 May 2021 • Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev

The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases.

Abstractive Text Summarization

Paper
Code

QAConv: Question Answering on Informative Conversations

1 code implementation • ACL 2022 • Chien-Sheng Wu, Andrea Madotto, Wenhao Liu, Pascale Fung, Caiming Xiong

This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source.

Question Answering

Paper
Code

Pseudo Siamese Network for Few-shot Intent Generation

no code implementations • 3 May 2021 • Congying Xia, Caiming Xiong, Philip Yu

PSN consists of two identical subnetworks with the same structure but different weights: an action network and an object network.

Intent Detection Object +1

Paper
Add Code

Learning to Synthesize Data for Semantic Parsing

1 code implementation • NAACL 2021 • Bailin Wang, Wenpeng Yin, Xi Victoria Lin, Caiming Xiong

Moreover, explicitly modeling compositions using PCFG leads to a better exploration of unseen programs, thus generating more diverse data.

Domain Generalization Semantic Parsing +3

Paper
Code

FeTaQA: Free-form Table Question Answering

1 code implementation • 1 Apr 2021 • Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev

Existing table question answering datasets contain abundant factual questions that primarily evaluate the query and schema comprehension capability of a system, but they fail to include questions that require complex reasoning and integration of information due to the constraint of the associated short-form answers.

Question Answering Retrieval +2

Paper
Code

Causal-aware Safe Policy Improvement for Task-oriented dialogue

1 code implementation • 10 Mar 2021 • Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong

This method gives guarantees on the dialogue policy's performance and also learns to shape rewards according to the intentions behind human responses, rather than just mimicking demonstration data; this, coupled with batch RL, helps overall with the sample efficiency of the framework.

Dialogue Management Management +1

Paper
Code

Structured Scene Memory for Vision-Language Navigation

1 code implementation • CVPR 2021 • Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., entailing an agent to navigate 3D environments through following linguistic instructions.

Decision Making Navigate +1

Paper
Code

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

no code implementations • NeurIPS 2021 • Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Real world applications such as economics and policy making often involve solving multi-agent games with two unique features: (1) The agents are inherently asymmetric and partitioned into leaders and followers; (2) The agents have different reward functions, thus the game is general-sum.

Paper
Add Code

Local Calibration: Metrics and Recalibration

no code implementations • 22 Feb 2021 • Rachel Luo, Aadyot Bhatnagar, Yu Bai, Shengjia Zhao, Huan Wang, Caiming Xiong, Silvio Savarese, Stefano Ermon, Edward Schmerling, Marco Pavone

In this work, we propose the local calibration error (LCE) to span the gap between average and individual reliability.

Decision Making Fairness

Paper
Add Code

Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification

no code implementations • 15 Feb 2021 • Yu Bai, Song Mei, Huan Wang, Caiming Xiong

Modern machine learning models with high accuracy are often miscalibrated -- the predicted top probability does not reflect the actual accuracy, and tends to be over-confident.

Binary Classification

Paper
Add Code

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

1 code implementation • EACL 2021 • Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks.

Language Modelling Natural Language Understanding

Paper
Code

Robustness Gym: Unifying the NLP Evaluation Landscape

2 code implementations • NAACL 2021 • Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré

Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems.

Entity Linking

Paper
Code

Neural Bayes: A Generic Parameterization Method for Unsupervised Learning

no code implementations • 1 Jan 2021 • Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio

Disjoint Manifold Separation: Neural Bayes allows us to formulate an objective which can optimally label samples from disjoint manifolds present in the support of a continuous distribution.

Clustering Representation Learning

Paper
Add Code

Noise-Robust Contrastive Learning

no code implementations • 1 Jan 2021 • Junnan Li, Caiming Xiong, Steven Hoi

In contrast to most existing methods, we combat noise by learning robust representation.

Contrastive Learning

Paper
Add Code

Improved Uncertainty Post-Calibration via Rank Preserving Transforms

no code implementations • 1 Jan 2021 • Yu Bai, Tengyu Ma, Huan Wang, Caiming Xiong

In this paper, we propose Neural Rank Preserving Transforms (NRPT), a new post-calibration method that adjusts the output probabilities of a trained classifier using a calibrator of higher capacity, while maintaining its prediction accuracy.

Text Classification

Paper
Add Code

Momentum Contrastive Autoencoder

no code implementations • 1 Jan 2021 • Devansh Arpit, Aadyot Bhatnagar, Huan Wang, Caiming Xiong

Quantitatively, we show that our algorithm achieves a new state-of-the-art FID of 54.36 on CIFAR-10, and performs competitively with existing models on CelebA in terms of FID score.

Contrastive Learning Representation Learning

Paper
Add Code

Learning From Noisy Data With Robust Representation Learning

1 code implementation • ICCV 2021 • Junnan Li, Caiming Xiong, Steven C.H. Hoi

In contrast to most existing methods, we combat noise by learning robust representation.

Contrastive Learning Representation Learning

Paper
Code

ERMAS: Learning Policies Robust to Reality Gaps in Multi-Agent Simulations

no code implementations • 1 Jan 2021 • Eric Zhao, Alexander R Trott, Caiming Xiong, Stephan Zheng

Policies for real-world multi-agent problems, such as optimal taxation, can be learned in multi-agent simulations with AI agents that emulate humans.

Meta-Learning

Paper
Add Code

CorDial: Coarse-to-fine Abstractive Dialogue Summarization with Controllable Granularity

no code implementations • 1 Jan 2021 • Chien-Sheng Wu, Linqing Liu, Wenhao Liu, Pontus Stenetorp, Caiming Xiong

2) A simple strategy to control the granularity of the final summary.

Abstractive Dialogue Summarization

Paper
Add Code

FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

1 code implementation • EMNLP 2021 • Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong

With the availability of the fast influence functions, we demonstrate their usefulness in four applications.

Data Augmentation

Paper
Code

Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

no code implementations • 28 Dec 2020 • Stanislaw Jastrzebski, Devansh Arpit, Oliver Astrand, Giancarlo Kerg, Huan Wang, Caiming Xiong, Richard Socher, Kyunghyun Cho, Krzysztof Geras

The early phase of training a deep neural network has a dramatic effect on the local curvature of the loss function.

Memorization

Paper
Add Code

Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Xi Victoria Lin, Richard Socher, Caiming Xiong

We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing.

Deep Attention Semantic Parsing +1

214

Paper
Code

Learning from Mistakes: Using Mis-predictions as Harm Alerts in Language Pre-Training

no code implementations • 16 Dec 2020 • Chen Xing, Wenhao Liu, Caiming Xiong

According to recent studies and our empirical observations, one possible reason is that some easy-to-fit patterns in the training data, such as frequently co-occurring word combinations, dominate and harm pre-training, making it hard for the model to fit more complex information.

Sentence

Paper
Add Code

CTRLsum: Towards Generic Controllable Text Summarization

1 code implementation • 8 Dec 2020 • Junxian He, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

Our approach enables users to control multiple aspects of generated summaries by interacting with the summarization system through textual input in the form of a set of keywords or descriptive prompts.

Descriptive Reading Comprehension +1

145

Paper
Code

GAEA: Graph Augmentation for Equitable Access via Reinforcement Learning

1 code implementation • 7 Dec 2020 • Govardana Sachithanandam Ramachandran, Ivan Brugere, Lav R. Varshney, Caiming Xiong

Similarly, social networks within universities and organizations may enable certain groups to more easily access people with valuable information or influence.

Reinforcement Learning (RL)

Paper
Code

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

no code implementations • 3 Dec 2020 • Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi

One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts of limited training data.

Language Modelling Multi-Task Learning +2

Paper
Add Code

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

3 code implementations • ICCV 2021 • Junnan Li, Caiming Xiong, Steven Hoi

CoMatch jointly learns two representations of the training data, their class probabilities and low-dimensional embeddings.

Ranked #2 on Semi-Supervised Image Classification on CIFAR-10, 20 Labels

Contrastive Learning Representation Learning +2

121

Paper
Code

What's New? Summarizing Contributions in Scientific Literature

no code implementations • 6 Nov 2020 • Hiroaki Hayashi, Wojciech Kryściński, Bryan McCann, Nazneen Rajani, Caiming Xiong

To overcome this problem, we introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work, making it easier to identify the key findings shared in articles.

Disentanglement

Paper
Add Code

Improving Limited Labeled Dialogue State Tracking with Self-Supervision

no code implementations • Findings of the Association for Computational Linguistics 2020 • Chien-Sheng Wu, Steven Hoi, Caiming Xiong

We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior.

Dialogue State Tracking

Paper
Add Code

Probing Task-Oriented Dialogue Representation from Language Models

no code implementations • EMNLP 2020 • Chien-Sheng Wu, Caiming Xiong

This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.

Clustering Language Modelling +1

Paper
Add Code

Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

1 code implementation • EMNLP 2020 • Jian-Guo Zhang, Kazuma Hashimoto, Wenhao Liu, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong

Intent detection is one of the core components of goal-oriented dialog systems, and detecting out-of-scope (OOS) intents is also a practically important skill.

Few-Shot Learning Goal-Oriented Dialog +3

Paper
Code

CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers

2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong

Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.

Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)

counterfactual Dialogue State Tracking +1

Paper
Code

Unsupervised Paraphrasing with Pretrained Language Models

no code implementations • EMNLP 2021 • Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong

To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step.

Blocking Language Modelling +3

Paper
Add Code

Online Structured Meta-learning

no code implementations • NeurIPS 2020 • Huaxiu Yao, Yingbo Zhou, Mehrdad Mahdavi, Zhenhui Li, Richard Socher, Caiming Xiong

When a new task is encountered, it constructs a meta-knowledge pathway by either utilizing the most relevant knowledge blocks or exploring new blocks.

Meta-Learning

Paper
Add Code

Explaining and Improving Model Behavior with k Nearest Neighbor Representations

no code implementations • 18 Oct 2020 • Nazneen Fatema Rajani, Ben Krause, Wengpeng Yin, Tong Niu, Richard Socher, Caiming Xiong

Interpretability techniques in NLP have mainly focused on understanding individual predictions using attention visualization or gradient-based saliency maps over tokens.

Natural Language Inference

Paper
Add Code

How Important is the Train-Validation Split in Meta-Learning?

no code implementations • 12 Oct 2020 • Yu Bai, Minshuo Chen, Pan Zhou, Tuo Zhao, Jason D. Lee, Sham Kakade, Huan Wang, Caiming Xiong

A common practice in meta-learning is to perform a train-validation split (train-val method) where the prior adapts to the task on one split of the data, and the resulting predictor is evaluated on another split.

Meta-Learning

Paper
Add Code

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning

no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E

The result shows that (1) the escaping time of both SGD and ADAM depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in larger Radon measure of a basin; (b) the exponential gradient average in ADAM smooths its gradient and leads to lighter gradient noise tails than SGD.

Paper
Add Code

Representation Learning for Sequence Data with Deep Autoencoding Predictive Components

2 code implementations • ICLR 2021 • Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong

We propose Deep Autoencoding Predictive Components (DAPC) -- a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.

Automatic Speech Recognition (ASR) +3

Paper
Code

Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start

1 code implementation • EMNLP 2020 • Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, Caiming Xiong

We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting, and show its effectiveness as a unified solver for several downstream NLP tasks such as question answering and coreference resolution when the end-task annotations are limited.

coreference-resolution Natural Language Inference +1

Paper
Code

Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading

1 code implementation • EMNLP 2020 • Yifan Gao, Chien-Sheng Wu, Jingjing Li, Shafiq Joty, Steven C. H. Hoi, Caiming Xiong, Irwin King, Michael R. Lyu

Based on the learned EDU and entailment representations, we either reply to the user with our final decision ("yes/no/irrelevant") on the initial question, or generate a follow-up question to request more information.

Decision Making Discourse Segmentation +3

Paper
Code

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

1 code implementation • ICLR 2021 • Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data.

Ranked #8 on Semantic Parsing on spider

Inductive Bias Language Modelling +3

Paper
Code

Composed Variational Natural Language Generation for Few-shot Intents

no code implementations • Findings of the Association for Computational Linguistics 2020 • Congying Xia, Caiming Xiong, Philip Yu, Richard Socher

In this paper, we focus on generating training examples for few-shot intents in the realistic imbalanced scenario.

Intent Detection Text Generation

Paper
Add Code

MoPro: Webly Supervised Learning with Momentum Prototypes

2 code implementations • ICLR 2021 • Junnan Li, Caiming Xiong, Steven C. H. Hoi

We propose momentum prototypes (MoPro), a simple contrastive learning method that achieves online label noise correction, out-of-distribution sample removal, and representation learning.

Ranked #12 on Image Classification on OmniBenchmark (using extra training data)

Contrastive Learning Image Classification +2

Paper
Code

Photon: A Robust Cross-Domain Text-to-SQL System

no code implementations • ACL 2020 • Jichuan Zeng, Xi Victoria Lin, Caiming Xiong, Richard Socher, Michael R. Lyu, Irwin King, Steven C. H. Hoi

Natural language interfaces to databases (NLIDB) democratize end user access to relational data.

Text-To-SQL

Paper
Add Code

SummEval: Re-evaluating Summarization Evaluation

5 code implementations • 24 Jul 2020 • Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev

The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress.

Text Summarization

342

Paper
Code

DART: Open-Domain Structured Data Record to Text Generation

2 code implementations • NAACL 2021 • Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures.

Domain Generalization Semantic Parsing +2

142

Paper
Code

Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading

1 code implementation • ACL 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael Lyu, Steven C. H. Hoi

The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.

Decision Making Reading Comprehension +1

Paper
Code

Theory-Inspired Path-Regularized Differential Network Architecture Search

1 code implementation • NeurIPS 2020 • Pan Zhou, Caiming Xiong, Richard Socher, Steven C. H. Hoi

Then we propose a theory-inspired path-regularized DARTS that consists of two key modules: (i) a differential group-structured sparse binary gate introduced for each operation to avoid unfair competition among operations, and (ii) a path-depth-wise regularization used to incite search exploration for deep architectures that often converge slower than shallow ones as shown in our theory and are not well explored during the search.

Image Classification

Paper
Code

BERTology Meets Biology: Interpreting Attention in Protein Language Models

2 code implementations • ICLR 2021 • Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

Transformer architectures have proven to learn useful representations for protein classification and generation tasks.

298

Paper
Code

Towards Understanding Hierarchical Learning: Benefits of Neural Representations

no code implementations • NeurIPS 2020 • Minshuo Chen, Yu Bai, Jason D. Lee, Tuo Zhao, Huan Wang, Caiming Xiong, Richard Socher

When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that neural representation can achieve improved sample complexities compared with the raw input: for learning a low-rank degree-$p$ polynomial ($p \geq 4$) in $d$ dimensions, neural representation requires only $\tilde{O}(d^{\lceil p/2 \rceil})$ samples, while the best-known sample complexity upper bound for the raw input is $\tilde{O}(d^{p-1})$.

Paper
Add Code

A High-Quality Multilingual Dataset for Structured Documentation Translation

1 code implementation • WS 2019 • Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong

We build and evaluate translation models for seven target languages from English, with several different copy mechanisms and an XML-constrained beam search.

Translation Vocal Bursts Intensity Prediction

Paper
Code

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos

no code implementations • CVPR 2021 • Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong

Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications.

Ranked #5 on Online Action Detection on THUMOS'14

Action Recognition Online Action Detection

Paper
Add Code

EMT: Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading

1 code implementation • 26 May 2020 • Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael R. Lyu, Steven C. H. Hoi

The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.

Decision Making Reading Comprehension +1

Paper
Code

Prototypical Contrastive Learning of Unsupervised Representations

2 code implementations • ICLR 2021 • Junnan Li, Pan Zhou, Caiming Xiong, Steven C. H. Hoi

This paper presents Prototypical Contrastive Learning (PCL), an unsupervised representation learning method that addresses the fundamental limitations of instance-wise contrastive learning.

Ranked #5 on Contrastive Learning on imagenet-1k

Clustering Contrastive Learning +4

538

Paper
Code

Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

1 code implementation • ACL 2020 • Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models.

Word Embeddings

Paper
Code

ESPRIT: Explaining Solutions to Physical Reasoning Tasks

2 code implementations • ACL 2020 • Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev

Our framework learns to generate explanations of how the physical simulation will causally evolve so that an agent or a human can easily reason about a solution using those interpretable descriptions.

425

Paper
Code

VD-BERT: A Unified Vision and Dialog Transformer with BERT

1 code implementation • EMNLP 2020 • Yue Wang, Shafiq Joty, Michael R. Lyu, Irwin King, Caiming Xiong, Steven C. H. Hoi

By contrast, in this work, we propose VD-BERT, a simple yet effective framework of unified vision-dialog Transformer that leverages the pretrained BERT language models for Visual Dialog tasks.

Answer Generation Visual Dialog

Paper
Code

TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue

1 code implementation • EMNLP 2020 • Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong

The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice.

Dialogue State Tracking Intent Detection +3

285

Paper
Code

An investigation of phone-based subword units for end-to-end speech recognition

no code implementations • 8 Apr 2020 • Weiran Wang, Guangsen Wang, Aadyot Bhatnagar, Yingbo Zhou, Caiming Xiong, Richard Socher

For Switchboard, our phone-based BPE system achieves 6.8%/14.4% word error rate (WER) on the Switchboard/CallHome portion of the test set while joint decoding achieves 6.3%/13.3% WER.

Language Modelling speech-recognition +1

Paper
Add Code

Towards Noise-resistant Object Detection with Noisy Annotations

no code implementations • 3 Mar 2020 • Junnan Li, Caiming Xiong, Richard Socher, Steven Hoi

We address the challenging problem of training object detectors with noisy annotations, where the noise contains a mixture of label noise and bounding box noise.

Object object-detection +1

Paper
Add Code

Differentially Private Deep Learning with Smooth Sensitivity

no code implementations • 1 Mar 2020 • Lichao Sun, Yingbo Zhou, Philip S. Yu, Caiming Xiong

Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice.

Paper
Add Code

Adv-BERT: BERT is not robust on misspellings! Generating nature adversarial samples on BERT

no code implementations • 27 Feb 2020 • Lichao Sun, Kazuma Hashimoto, Wenpeng Yin, Akari Asai, Jia Li, Philip Yu, Caiming Xiong

There is an increasing amount of literature that claims the brittleness of deep neural networks in dealing with adversarial examples that are created maliciously.

Question Answering Sentence +1

Paper
Add Code

Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning

1 code implementation • 20 Feb 2020 • Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio

Disjoint Manifold Labeling: Neural Bayes allows us to formulate an objective which can optimally label samples from disjoint manifolds present in the support of a continuous distribution.

Clustering Representation Learning

Paper
Code

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

1 code implementation • ICML 2020 • Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres

We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.

Paper
Code

Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width

no code implementations • 10 Feb 2020 • Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, Richard Socher

We propose Taylorized training as an initiative towards better understanding neural network training at finite width.

Paper
Add Code

Proposal Learning for Semi-Supervised Object Detection

no code implementations • 15 Jan 2020 • Peng Tang, Chetan Ramaiah, Yan Wang, Ran Xu, Caiming Xiong

two-stage object detectors) by training on both labeled and unlabeled data.

Object object-detection +2

Paper
Add Code

Learning from Noisy Anchors for One-stage Object Detection

1 code implementation • CVPR 2020 • Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis

State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.

Classification General Classification +3

Paper
Code

LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

no code implementations • NeurIPS 2019 • Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Larry S. Davis

This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource efficient video recognition, suitable for both online and offline scenarios.

Video Recognition

Paper
Add Code

Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering

2 code implementations • ICLR 2020 • Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong

Answering questions that require multi-hop reasoning at web-scale necessitates retrieving multiple evidence documents, one of which often has little lexical or semantic relationship to the question.

Ranked #26 on Question Answering on HotpotQA

Question Answering Retrieval

420

Paper
Code

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

no code implementations • 18 Nov 2019 • Tong Che, Xiaofeng Liu, Site Li, Yubin Ge, Ruixiang Zhang, Caiming Xiong, Yoshua Bengio

We test the verifier network on out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation.

Anomaly Detection Autonomous Driving +4

Paper
Add Code

MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models

no code implementations • 9 Nov 2019 • Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong

Our approach is model agnostic and can be easily applied on different future teacher model architectures.

Knowledge Distillation Multi-Task Learning

Paper
Add Code

ERASER: A Benchmark to Evaluate Rationalized NLP Models

2 code implementations • ACL 2020 • Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace

We propose several metrics that aim to capture how well the rationales provided by models align with human rationales, and also how faithful these rationales are (i.e., the degree to which provided rationales influenced the corresponding predictions).

Paper
Code

Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards

1 code implementation • NeurIPS 2019 • Alexander Trott, Stephan Zheng, Caiming Xiong, Richard Socher

For instance, in tasks where the agent must achieve some goal state, simple distance-to-goal reward shaping often fails, as it renders learning vulnerable to local optima.

Paper
Code
