Search Results for author: Xifeng Yan

Found 42 papers, 19 papers with code

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

no code implementations • 5 Mar 2024 • Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang

Our MLM filter can generalize to different models and tasks, and be used as a drop-in replacement for CLIPScore.

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling

no code implementations • 16 Feb 2024 • Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook

We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip modestly sized models, specifically a 13B parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT while maintaining their chat capabilities.

Avg Dialogue State Tracking +1

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

no code implementations • 2 Nov 2023 • Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.

Image Generation

Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection

2 code implementations • 17 Aug 2023 • Zekun Li, Baolin Peng, Pengcheng He, Xifeng Yan

In this work, we establish a benchmark to evaluate the robustness of instruction-following LLMs against prompt injection attacks.

Instruction Following

Augmenting Language Models with Long-Term Memory

no code implementations • NeurIPS 2023 • Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness.

In-Context Learning Language Modelling +1

STEPS: A Benchmark for Order Reasoning in Sequential Tasks

no code implementations • 7 Jun 2023 • Weizhi Wang, Hong Wang, Xifeng Yan

Therefore, to verify the order reasoning capability of current neural models in sequential tasks, we propose a challenging benchmark named STEPS.

In-Context Learning

Graph Reasoning for Question Answering with Triplet Retrieval

no code implementations • 30 May 2023 • Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin

State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into a KG encoder, e.g., graph neural networks (GNNs), to model their local structures and integrated into language models for question answering.

Knowledge Graphs Question Answering +1

Bot or Human? Detecting ChatGPT Imposters with A Single Question

1 code implementation • 10 May 2023 • Hong Wang, Xuan Luo, Weizhi Wang, Xifeng Yan

Large language models like GPT-4 have recently demonstrated impressive capabilities in natural language understanding and generation, enabling various applications including translation, essay writing, and chit-chatting.

Language Modelling Large Language Model +2

Time Series as Images: Vision Transformer for Irregularly Sampled Time Series

1 code implementation • NeurIPS 2023 • Zekun Li, Shiyang Li, Xifeng Yan

This paper introduces a novel perspective by converting irregularly sampled time series into line graph images, then utilizing powerful pre-trained vision transformers for time series classification in the same way as image classification.

Image Classification Time Series +1

Language Model Detoxification in Dialogue with Contextualized Stance Control

no code implementations • 25 Jan 2023 • Jing Qian, Xifeng Yan

To reduce the toxic degeneration in a pretrained Language Model (LM), previous work on Language Model detoxification has focused on reducing the toxicity of the generation itself (self-toxicity) without consideration of the context.

Language Modelling Response Generation

Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

1 code implementation • 18 Oct 2022 • Xinlu Zhang, Shiyang Li, Zhiyu Chen, Xifeng Yan, Linda Petzold

Our method first addresses irregularity in each single modality by (1) modeling irregular time series by dynamically incorporating hand-crafted imputation embeddings into learned interpolation embeddings via a gating mechanism, and (2) casting a series of clinical note representations as multivariate irregular time series and tackling irregularity via a time attention mechanism.

Imputation Irregular Time Series +2

Explanations from Large Language Models Make Small Reasoners Better

no code implementations • 13 Oct 2022 • Shiyang Li, Jianshu Chen, Yelong Shen, Zhiyu Chen, Xinlu Zhang, Zekun Li, Hong Wang, Jing Qian, Baolin Peng, Yi Mao, Wenhu Chen, Xifeng Yan

Integrating free-text explanations into in-context learning of large language models (LLMs) is shown to elicit strong reasoning capabilities along with reasonable explanations.

Explanation Generation In-Context Learning +1

Controllable Dialogue Simulation with In-Context Learning

1 code implementation • 9 Oct 2022 • Zekun Li, Wenhu Chen, Shiyang Li, Hong Wang, Jing Qian, Xifeng Yan

Experimental results on the MultiWOZ dataset demonstrate that training a model on the simulated dialogues leads to even better performance than using the same amount of human-generated dialogues under the challenging low-resource settings, with as few as 85 dialogues as a seed.

Data Augmentation In-Context Learning +2

Limitations of Language Models in Arithmetic and Symbolic Induction

no code implementations • 9 Aug 2022 • Jing Qian, Hong Wang, Zekun Li, Shiyang Li, Xifeng Yan

LMs with tutor are able to deliver 100% accuracy in situations of OOD and repeating symbols, offering new insights into the boundary of large LMs in induction.

Visually-Augmented Language Modeling

1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.

Image Retrieval Language Modelling +1

Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding

no code implementations • Findings (EMNLP) 2021 • Shiyang Li, Semih Yavuz, Wenhu Chen, Xifeng Yan

Task-adaptive pre-training (TAPT) and self-training (ST) have emerged as the major semi-supervised approaches to improve natural language understanding (NLU) tasks with massive amounts of unlabeled data.

Named Entity Recognition +6

Semi-Supervised Hypothesis Transfer for Source-Free Domain Adaptation

no code implementations • 14 Jul 2021 • Ning Ma, Jiajun Bu, Lixian Lu, Jun Wen, Zhen Zhang, Sheng Zhou, Xifeng Yan

Domain adaptation has been widely used to deal with distribution shift in vision, language, multimedia, etc.

Source-Free Domain Adaptation

Lifelong Learning of Hate Speech Classification on Social Media

no code implementations • NAACL 2021 • Jing Qian, Hong Wang, Mai ElSherief, Xifeng Yan

In this work, we propose lifelong learning of hate speech classification on social media.

Classification Representation Learning

Inductive Relation Prediction by BERT

1 code implementation • 12 Mar 2021 • Hanwen Zha, Zhiyu Chen, Xifeng Yan

Relation prediction in knowledge graphs is dominated by embedding based methods which mainly focus on the transductive setting.

Few-Shot Learning Inductive Relation Prediction +4

Composite Re-Ranking for Efficient Document Search with BERT

no code implementations • 11 Mar 2021 • Yingrui Yang, Yifan Qiao, Jinjin Shao, Mayuresh Anand, Xifeng Yan, Tao Yang

By applying token encoding on top of a dual-encoder architecture, BECR separates the attentions between a query and a document while capturing the contextual semantics of a query.

Re-Ranking

Cross-modal Image Retrieval with Deep Mutual Information Maximization

no code implementations • 10 Mar 2021 • Chunbin Gu, Jiajun Bu, Xixi Zhou, Chengwei Yao, Dongfang Ma, Zhi Yu, Xifeng Yan

Prior work usually uses a three-stage strategy to tackle this task: 1) extract the features of the inputs; 2) fuse the features of the source image and its modified text to obtain a fusion feature; 3) learn a similarity metric between the desired image and the source image + modified text by using deep metric learning.

Cross-Modal Retrieval Image Retrieval +3

Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases

1 code implementation • 16 Nov 2020 • Yu Gu, Sue Kase, Michelle Vanni, Brian Sadler, Percy Liang, Xifeng Yan, Yu Su

To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GrailQA, and provide evaluation settings for all three levels of generalization.

Knowledge Base Question Answering

Inter-Series Attention Model for COVID-19 Forecasting

1 code implementation • 25 Oct 2020 • Xiaoyong Jin, Yu-Xiang Wang, Xifeng Yan

The COVID-19 pandemic has had an unprecedented impact all over the world since early 2020.

Time Series Time Series Analysis

CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers

2 code implementations • ICLR 2021 • Shiyang Li, Semih Yavuz, Kazuma Hashimoto, Jia Li, Tong Niu, Nazneen Rajani, Xifeng Yan, Yingbo Zhou, Caiming Xiong

Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood.

Ranked #2 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)

counterfactual Dialogue State Tracking +1

KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation

1 code implementation • EMNLP 2020 • Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang

We propose a knowledge-grounded pre-training (KGPT), which consists of two parts, 1) a general knowledge-grounded generation model to generate knowledge-enriched text.

Ranked #9 on KG-to-Text Generation on WebNLG 2.0 (Unconstrained)

General Knowledge KG-to-Text Generation +1

Adaptive-Step Graph Meta-Learner for Few-Shot Graph Classification

no code implementations • 18 Mar 2020 • Ning Ma, Jiajun Bu, Jieyu Yang, Zhen Zhang, Chengwei Yao, Zhi Yu, Sheng Zhou, Xifeng Yan

The shared sub-structures between training classes and test classes are essential in few-shot graph classification.

Few-Shot Learning General Classification +3

Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning

1 code implementation • 31 Oct 2019 • Arvind Neelakantan, Semih Yavuz, Sharan Narang, Vishaal Prasad, Ben Goodrich, Daniel Duckworth, Chinnadhurai Sankar, Xifeng Yan

In this paper, we develop Neural Assistant: a single neural network model that takes conversation history and an external knowledge source as input and jointly produces both text response and action to be taken by the system as output.

Response Generation Retrieval +1

You May Not Need Order in Time Series Forecasting

no code implementations • 21 Oct 2019 • Yunkai Zhang, Qiao Jiang, Shurui Li, Xiaoyong Jin, Xueying Ma, Xifeng Yan

Time series forecasting with limited data is a challenging yet critical task.

Time Series Time Series Forecasting

Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

2 code implementations • NeurIPS 2019 • Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, Xifeng Yan

Time series forecasting is an important problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situation.

Ranked #27 on Image Generation on ImageNet 64x64 (Bits per dim metric)

Time Series Time Series Forecasting

Global Textual Relation Embedding for Relational Understanding

1 code implementation • ACL 2019 • Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su

Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks.

Ranked #142 on Action Classification on Kinetics-400

Action Classification Relation +3

Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention

2 code implementations • ACL 2019 • Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang

Semantically controlled neural response generation on limited domains has achieved great performance.

Ranked #5 on Data-to-Text Generation on MULTIWOZ 2.1

Data-to-Text Generation Inductive Bias +1

How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection

1 code implementation • NAACL 2019 • Wenhu Chen, Yu Su, Yilin Shen, Zhiyu Chen, Xifeng Yan, William Wang

Under deep neural networks, a pre-defined vocabulary is required to vectorize text inputs.

General Classification text-classification +1

What It Takes to Achieve 100% Condition Accuracy on WikiSQL

no code implementations • EMNLP 2018 • Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan

The SQL queries in WikiSQL are simple: Each involves one relation and does not have any join operation.

Translation

XL-NBT: A Cross-lingual Neural Belief Tracking Framework

1 code implementation • EMNLP 2018 • Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang

Then, we pre-train a state tracker for the source language as a teacher, which is able to exploit easy-to-access parallel data.

Transfer Learning

DialSQL: Dialogue Based Structured Query Generation

no code implementations • ACL 2018 • Izzeddin Gur, Semih Yavuz, Yu Su, Xifeng Yan

The recent advance in deep learning and semantic parsing has significantly improved the translation accuracy of natural language questions to structured queries.

Semantic Parsing Translation