Search Results for author: Jie Fu

Found 145 papers, 70 papers with code

Dynamic Generation of Personalities with Large Language Models

1 code implementation • 10 Apr 2024 • Jianzhi Liu, Hexiang Gu, Tianyu Zheng, Liuyu Xiang, Huijia Wu, Jie Fu, Zhaofeng He

We propose a new metric to assess personality generation capability based on this evaluation method.

Personality Generation

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

no code implementations • 5 Apr 2024 • Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Binhang Yuan, Wenhu Chen, Jie Fu, Ge Zhang

In this study, we introduce CT-LLM, a 2B large language model (LLM) that illustrates a pivotal shift towards prioritizing the Chinese language in developing LLMs.

Language Modelling, Large Language Model

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

no code implementations • 31 Mar 2024 • Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu

To this end, we propose learning to Refine Query for Retrieval Augmented Generation (RQ-RAG) in this paper, endeavoring to enhance the model by equipping it with capabilities for explicit rewriting, decomposition, and disambiguation.

In-Context Learning, Response Generation, +1

Preference-Based Planning in Stochastic Environments: From Partially-Ordered Temporal Goals to Most Preferred Policies

no code implementations • 27 Mar 2024 • Hazhar Rahmani, Abhishek N. Kulkarni, Jie Fu

In the second step, we prove that finding a most preferred policy is equivalent to computing a Pareto-optimal policy in a multi-objective MDP that is constructed from the original MDP, the preference automaton, and the chosen stochastic ordering relation.

Electrocardiogram Instruction Tuning for Report Generation

no code implementations • 7 Mar 2024 • Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Zhenwu Peng, Jie Fu, Rossella Arcucci, Huaxiu Yao, Mi Zhang

The electrocardiogram (ECG) serves as the primary non-invasive diagnostic tool for monitoring cardiac conditions and is crucial in assisting clinicians.

DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning

no code implementations • 7 Mar 2024 • Xingwei Qu, Yiming Liang, Yucheng Wang, Tianyu Zheng, Tommy Yue, Lei Ma, Stephen W. Huang, Jiajun Zhang, Wenhu Chen, Chenghua Lin, Jie Fu, Ge Zhang

It has long been assumed that the sheer number of parameters in large language models (LLMs) drives in-context learning (ICL) capabilities, enabling remarkable performance improvements by leveraging task-specific demonstrations.

Few-Shot Learning, In-Context Learning, +1

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

no code implementations • 26 Feb 2024 • Alex Zhuang, Ge Zhang, Tianyu Zheng, Xinrun Du, Junjie Wang, Weiming Ren, Stephen W. Huang, Jie Fu, Xiang Yue, Wenhu Chen

Utilizing this dataset, we train a series of models, referred to as StructLM, based on the Code-LLaMA architecture, ranging from 7B to 34B parameters.

m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers

1 code implementation • 26 Feb 2024 • Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu

Leveraging the knowledge from monolithic models, using techniques such as knowledge distillation, is likely to facilitate the training of modular models and enable them to integrate knowledge from multiple models pretrained on diverse sources.

Knowledge Distillation

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

no code implementations • 22 Feb 2024 • Tianyu Zheng, Ge Zhang, Tianhao Shen, Xueling Liu, Bill Yuchen Lin, Jie Fu, Wenhu Chen, Xiang Yue

However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter.

Code Generation

HyperMoE: Paying Attention to Unselected Experts in Mixture of Experts via Dynamic Transfer

1 code implementation • 20 Feb 2024 • Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu

The Mixture of Experts (MoE) for language models has been proven effective in augmenting the capacity of models by dynamically routing each input token to a specific subset of experts for processing.

Multi-Task Learning

CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation

1 code implementation • 20 Feb 2024 • Yujie Shao, Xinrong Yao, Xingwei Qu, Chenghua Lin, Shi Wang, Stephen W. Huang, Ge Zhang, Jie Fu

These models are able to generate creative and fluent metaphor sentences more frequently induced by selected samples from our dataset, demonstrating the value of our corpus for Chinese metaphor research.

Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning

no code implementations • 19 Feb 2024 • Shuai Zhao, Leilei Gan, Luu Anh Tuan, Jie Fu, Lingjuan Lyu, Meihuizi Jia, Jinming Wen

Motivated by this insight, we developed a Poisoned Sample Identification Module (PSIM) leveraging PEFT, which identifies poisoned samples through confidence, providing robust defense against weight-poisoning backdoor attacks.

Backdoor Attack, text-classification, +1

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

1 code implementation • 19 Feb 2024 • Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu

We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.

Language Modelling, Large Language Model

Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers

1 code implementation • 19 Feb 2024 • Zihan Qiu, Zeyu Huang, Youcheng Huang, Jie Fu

The feed-forward networks (FFNs) in transformers are recognized as a group of key-value neural memories to restore abstract high-level knowledge.

knowledge editing

Pixel Sentence Representation Learning

1 code implementation • 13 Feb 2024 • Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.

Natural Language Inference, Representation Learning, +3

Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction

1 code implementation • 6 Feb 2024 • Yonggang Jin, Ge Zhang, Hao Zhao, Tianyu Zheng, Jiawei Guo, Liuyu Xiang, Shawn Yue, Stephen W. Huang, Zhaofeng He, Jie Fu

Drawing inspiration from the success of multimodal instruction tuning in visual tasks, we treat the visual-based RL task as a long-horizon vision task and construct a set of multimodal game instructions to incorporate instruction tuning into a decision transformer.

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

1 code implementation • 24 Jan 2024 • Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin

We further annotate the image-text pairs with two-level subset-subcategory hierarchy annotations to facilitate a more comprehensive evaluation of the baselines.

Benchmarking, Image Captioning, +3

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

1 code implementation • 22 Jan 2024 • Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu

We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

no code implementations • 13 Jan 2024 • Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng

Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources.

4k, Position

Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation

1 code implementation • 12 Jan 2024 • Tianyu Zheng, Shuyue Guo, Xingwei Qu, Jiawei Guo, Weixu Zhang, Xinrun Du, Qi Jia, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu, Ge Zhang

In this paper, we introduce Kun, a novel approach for creating high-quality instruction-tuning datasets for large language models (LLMs) without relying on manual annotations.

Instruction Following, Translation

Align on the Fly: Adapting Chatbot Behavior to Established Norms

1 code implementation • 26 Dec 2023 • Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, PengFei Liu

In this paper, we aim to align large language models with the ever-changing, complex, and diverse human values (e.g., social norms) across time and locations.

Chatbot

TACO: Topics in Algorithmic COde generation dataset

1 code implementation • 22 Dec 2023 • Rongao Li, Jie Fu, Bo-Wen Zhang, Tao Huang, Zhihong Sun, Chen Lyu, Guang Liu, Zhi Jin, Ge Li

Moreover, each TACO problem includes several fine-grained labels such as task topics, algorithms, programming skills, and difficulty levels, providing a more precise reference for the training and evaluation of code generation models.

Code Generation

Scalable Geometric Fracture Assembly via Co-creation Space among Assemblers

1 code implementation • 19 Dec 2023 • Ruiyuan Zhang, Jiaxiang Liu, Zexi Li, Hao Dong, Jie Fu, Chao Wu

Therefore, there is a need to develop a scalable framework for geometric fracture assembly without relying on semantic information.

3D Assembly

SynFundus-1M: A High-quality Million-scale Synthetic fundus images Dataset with Fifteen Types of Annotation

1 code implementation • 1 Dec 2023 • Fangxin Shang, Jie Fu, Yehui Yang, Haifeng Huang, Junwei Liu, Lei Ma

Large-scale public datasets with high-quality annotations are rarely available for intelligent medical imaging research, due to data privacy concerns and the cost of annotations.

Denoising

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers

no code implementations • 28 Nov 2023 • Cong Wei, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen

Existing information retrieval (IR) models often assume a homogeneous format, limiting their applicability to diverse user needs, such as searching for images with text descriptions, searching for a news article with a headline image, or finding a similar photo with a query image.

Benchmarking, Information Retrieval, +2

DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release

1 code implementation • 23 Nov 2023 • Jie Fu, Qingqing Ye, Haibo Hu, Zhili Chen, Lulu Wang, Kuncan Wang, Xun Ran

Motivated by this, this paper proposes DPSUR, a Differentially Private training framework based on Selective Updates and Release, where the gradient from each iteration is evaluated based on a validation test, and only those updates leading to convergence are applied to the model.

Privacy Preserving

Massive Editing for Large Language Models via Meta Learning

1 code implementation • 8 Nov 2023 • Chenmien Tan, Ge Zhang, Jie Fu

While large language models (LLMs) have enabled learning knowledge from the pre-training corpora, the acquired knowledge may be fundamentally incorrect or outdated over time, which necessitates rectifying the knowledge of the language model (LM) after the training.

Fact Checking, Language Modelling, +3

AI Alignment: A Comprehensive Survey

no code implementations • 30 Oct 2023 • Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen Mcaleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems' alignment and govern them appropriately to avoid exacerbating misalignment risks.

Covert Planning against Imperfect Observers

no code implementations • 25 Oct 2023 • Haoxiang Ma, Chongyang Shi, Shuo Han, Michael R. Dorothy, Jie Fu

This paper studies how covert planning can leverage the coupling of stochastic dynamics and the observer's imperfect observation to achieve optimal task performance without being detected.

Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning

1 code implementation • 18 Oct 2023 • Hao Zhao, Jie Fu, Zhaofeng He

Parameter-efficient fine-tuning (PEFT) has shown its effectiveness in adapting the pre-trained language models to downstream tasks while only updating a small number of parameters.

Multi-Task Learning

Unlocking Emergent Modularity in Large Language Models

1 code implementation • 17 Oct 2023 • Zihan Qiu, Zeyu Huang, Jie Fu

Despite the benefits of modularity, most Language Models (LMs) are still treated as monolithic models in the pre-train and fine-tune paradigm, with their emergent modularity locked and underutilized.

Domain Generalization, Transfer Learning

Heterogenous Memory Augmented Neural Networks

1 code implementation • 17 Oct 2023 • Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu

It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data scarcity and out-of-distribution (OOD) scenarios.

Retrieval

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

1 code implementation • 1 Oct 2023 • Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Man Zhang, Zhaoxiang Zhang, Wanli Ouyang, Ke Xu, Wenhu Chen, Jie Fu, Junran Peng

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters.

Benchmarking

AutoAgents: A Framework for Automatic Agent Generation

1 code implementation • 29 Sep 2023 • Guangyao Chen, Siwei Dong, Yu Shu, Ge Zhang, Jaward Sesay, Börje F. Karlsson, Jie Fu, Yemin Shi

Therefore, we introduce AutoAgents, an innovative framework that adaptively generates and coordinates multiple specialized agents to build an AI team according to different tasks.

ALI-DPFL: Differentially Private Federated Learning with Adaptive Local Iterations

no code implementations • 21 Aug 2023 • XinPeng Ling, Jie Fu, Kuncan Wang, Haitao Liu, Zhili Chen

Federated Learning (FL) is a distributed machine learning technique that allows model training among multiple devices or organizations by sharing training parameters instead of raw data.

Federated Learning

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate

1 code implementation • 14 Aug 2023 • Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu

Text evaluation has historically posed significant challenges, often demanding substantial labor and time cost.

Text Generation

On the Effectiveness of Speech Self-supervised Learning for Music

no code implementations • 11 Jul 2023 • Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu

Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech.

Information Retrieval, Music Information Retrieval, +2

Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing

no code implementations • 5 Jul 2023 • Jie Fu, Junyu Gao, Changsheng Xu

In this paper, to balance the feature learning processes of different modalities, a dynamic gradient modulation (DGM) mechanism is explored, where a novel and effective metric function is designed to measure the imbalanced feature learning between audio and visual modalities.

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

1 code implementation • 29 Jun 2023 • Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.

Automatic Lyrics Transcription, Language Modelling, +3

Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

1 code implementation • 22 Jun 2023 • Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, Bryan Hooi

To better break down the problem, we define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.

Arithmetic Reasoning, Benchmarking, +1

Deep Reinforcement Learning with Task-Adaptive Retrieval via Hypernetwork

1 code implementation • 19 Jun 2023 • Yonggang Jin, Chenxu Wang, Tianyu Zheng, Liuyu Xiang, Yaodong Yang, Junge Zhang, Jie Fu, Zhaofeng He

Deep reinforcement learning algorithms are usually impeded by sampling inefficiency, heavily depending on multiple interactions with the environment to acquire accurate decision-making capabilities.

Decision Making, Hippocampus, +2

Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias

1 code implementation • NeurIPS 2023 • Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci

Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities.

Disentanglement

TPDM: Selectively Removing Positional Information for Zero-shot Translation via Token-Level Position Disentangle Module

no code implementations • 31 May 2023 • Xingran Chen, Ge Zhang, Jie Fu

Due to Multilingual Neural Machine Translation's (MNMT) capability of zero-shot translation, many works have been carried out to fully exploit the potential of MNMT in zero-shot translation.

Position, Translation

HUB: Guiding Learned Optimizers with Continuous Prompt Tuning

no code implementations • 26 May 2023 • Gaole Dai, Wei Wu, Ziyu Wang, Jie Fu, Shanghang Zhang, Tiejun Huang

By incorporating hand-designed optimizers as the second component in our hybrid approach, we are able to retain the benefits of learned optimizers while stabilizing the training process and, more importantly, improving testing performance.

Meta-Learning

Think Before You Act: Decision Transformers with Internal Working Memory

1 code implementation • 24 May 2023 • Jikun Kang, Romain Laroche, Xindi Yuan, Adam Trischler, Xue Liu, Jie Fu

We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in parameters throughout training.

Atari Games, Decision Making, +2

Interactive Natural Language Processing

no code implementations • 22 May 2023 • Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.

Decision Making

Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility

1 code implementation • 15 May 2023 • Wentao Ye, Mingfeng Ou, Tianyi Li, Yipeng Chen, Xuetao Ma, Yifan Yanggong, Sai Wu, Jie Fu, Gang Chen, Haobo Wang, Junbo Zhao

With most of the related literature in the era of LLM uncharted, we propose an automated workflow that copes with an upscaled number of queries/responses.

Memorization

Huatuo-26M, a Large-scale Chinese Medical QA Dataset

1 code implementation • 2 May 2023 • Jianquan Li, Xidong Wang, Xiangbo Wu, Zhiyi Zhang, Xiaolong Xu, Jie Fu, Prayag Tiwari, Xiang Wan, Benyou Wang

Moreover, we experimentally show the benefit of the proposed dataset in several respects: (i) applying trained models to other QA datasets in a zero-shot fashion; (ii) serving as external knowledge for retrieval-augmented generation (RAG); and (iii) improving existing pre-trained language models by using the QA pairs as a pre-training corpus for continued training.

Language Modelling, Question Answering, +1

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

no code implementations • 2 May 2023 • Shuai Zhao, Jinming Wen, Luu Anh Tuan, Junbo Zhao, Jie Fu

Our method does not require external triggers and ensures correct labeling of poisoned samples, improving the stealthy nature of the backdoor attack.

Backdoor Attack, Few-Shot Text Classification, +1

MUDiff: Unified Diffusion for Complete Molecule Generation

no code implementations • 28 Apr 2023 • Chenqing Hua, Sitao Luan, Minkai Xu, Rex Ying, Jie Fu, Stefano Ermon, Doina Precup

Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.

Drug Discovery, valid

When Do Graph Neural Networks Help with Node Classification? Investigating the Impact of Homophily Principle on Node Distinguishability

1 code implementation • 25 Apr 2023 • Sitao Luan, Chenqing Hua, Minkai Xu, Qincheng Lu, Jiaqi Zhu, Xiao-Wen Chang, Jie Fu, Jure Leskovec, Doina Precup

The homophily principle, i.e., that nodes with the same labels are more likely to be connected, has been believed to be the main reason for the performance superiority of Graph Neural Networks (GNNs) over Neural Networks on node classification tasks.

Node Classification, Stochastic Block Model

Probabilistic Planning with Prioritized Preferences over Temporal Logic Objectives

no code implementations • 23 Apr 2023 • Lening Li, Hazhar Rahmani, Jie Fu

We demonstrate the efficacy and applicability of the logic and the algorithm on several case studies with detailed analyses for each.

Chinese Open Instruction Generalist: A Preliminary Release

2 code implementations • 17 Apr 2023 • Ge Zhang, Yemin Shi, Ruibo Liu, Ruibin Yuan, Yizhi Li, Siwei Dong, Yu Shu, Zhaoqun Li, Zekun Wang, Chenghua Lin, Wenhao Huang, Jie Fu

Instruction tuning is widely recognized as a key technique for building generalist language models, which has attracted the attention of researchers and the public with the release of InstructGPT (Ouyang et al., 2022) and ChatGPT (https://chat.openai.com/).

Chain of Thought Prompt Tuning in Vision Language Models

no code implementations • 16 Apr 2023 • Jiaxin Ge, Hongyin Luo, Siyuan Qian, Yulu Gan, Jie Fu, Shanghang Zhang

Chain of Thought is a simple and effective approximation of the human reasoning process and has proven useful for natural language processing (NLP) tasks.

Domain Generalization, Image Classification, +4

Synthesis of Opacity-Enforcing Winning Strategies Against Colluded Opponent

no code implementations • 3 Apr 2023 • Chongyang Shi, Abhishek N. Kulkarni, Hazhar Rahmani, Jie Fu

Furthermore, if such a strategy does not exist, winning for P1 must entail the price of revealing his secret to the observer.

Motion Planning

Modular Retrieval for Generalization and Interpretation

1 code implementation • 23 Mar 2023 • Juhao Liang, Chen Zhang, Zhengyang Tang, Jie Fu, Dawei Song, Benyou Wang

Built upon the paradigm, we propose a retrieval model with modular prompt tuning named REMOP.

Language Modelling, Retrieval

A Pathway Towards Responsible AI Generated Content

no code implementations • 2 Mar 2023 • Chen Chen, Jie Fu, Lingjuan Lyu

AI Generated Content (AIGC) has received tremendous attention within the past few years, with content generated in the format of image, text, audio, video, etc.

Misinformation

Quantitative Planning with Action Deception in Concurrent Stochastic Games

no code implementations • 3 Jan 2023 • Chongyang Shi, Shuo Han, Jie Fu

In this setup, we investigate P1's strategic planning of action deception that decides when to deviate from the Nash equilibrium in P2's game model and employ a hidden action, so that P1 can maximize the value of action deception, which is the additional payoff compared to P1's payoff in the game where P2 has complete information.

Motion Planning

CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation

1 code implementation • 1 Jan 2023 • Ge Zhang, Yizhi Li, Yaoyao Wu, Linyuan Zhang, Chenghua Lin, Jiayi Geng, Shi Wang, Jie Fu

As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques such as large-scale language models suffer from data inadequacy and biased corpus, especially for languages with insufficient resources such as Chinese.

Sentence

A Close Look at Spatial Modeling: From Attention to Convolution

1 code implementation • 23 Dec 2022 • Xu Ma, Huan Wang, Can Qin, Kunpeng Li, Xingchen Zhao, Jie Fu, Yun Fu

Vision Transformers have shown great promise recently for many vision tasks due to the insightful architecture design and attention mechanism.

Instance Segmentation, object-detection, +2

Adap DP-FL: Differentially Private Federated Learning with Adaptive Noise

no code implementations • 29 Nov 2022 • Jie Fu, Zhili Chen, Xiao Han

The heterogeneity and convergence of training parameters were simply not considered.

Federated Learning

SA-DPSGD: Differentially Private Stochastic Gradient Descent based on Simulated Annealing

no code implementations • 14 Nov 2022 • Jie Fu, Zhili Chen, XinPeng Ling

Differentially private stochastic gradient descent (DPSGD) is the most popular training method with differential privacy in image recognition.

Image Classification

Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition

1 code implementation • 7 Nov 2022 • Youcheng Huang, Wenqiang Lei, Jie Fu, Jiancheng Lv

Incorporating large-scale pre-trained models with the prototypical neural networks is a de-facto paradigm in few-shot named entity recognition.

named-entity-recognition, Named Entity Recognition, +1

HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models

1 code implementation • 5 Nov 2022 • Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Shi Wang, Anton Ragni, Jie Fu

In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups.

Fairness

1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality Classification of Socio-Political Event Data

1 code implementation • 4 Nov 2022 • Adam Nik, Ge Zhang, Xingran Chen, Mingyu Li, Jie Fu

This paper details our participation in the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) workshop @ EMNLP 2022, where we take part in Subtask 1 of Shared Task 3.

1Cademy @ Causal News Corpus 2022: Enhance Causal Span Detection via Beam-Search-based Position Selector

1 code implementation • 31 Oct 2022 • Xingran Chen, Ge Zhang, Adam Nik, Mingyu Li, Jie Fu

In this paper, we present our approach and empirical observations for Cause-Effect Signal Span Detection, Subtask 2 of Shared Task 3 (Tan et al., 2022) at CASE 2022.

Data Augmentation, Language Modelling, +3

Text Editing as Imitation Game

1 code implementation • 21 Oct 2022 • Ning Shi, Bin Tang, Bo Yuan, Longtao Huang, Yewen Pu, Jie Fu, Zhouhan Lin

Text editing, such as grammatical error correction, arises naturally from imperfect textual data.

Action Generation, Grammatical Error Correction, +1

Opportunistic Qualitative Planning in Stochastic Systems with Incomplete Preferences over Reachability Objectives

no code implementations • 4 Oct 2022 • Abhishek N. Kulkarni, Jie Fu

We construct a model called an improvement MDP, in which the synthesis of SPI and SASI strategies that guarantee at least one improvement reduces to computing positive and almost-sure winning strategies in an MDP.

Motion Planning

Probabilistic Planning with Partially Ordered Preferences over Temporal Goals

no code implementations • 25 Sep 2022 • Hazhar Rahmani, Abhishek N. Kulkarni, Jie Fu

We prove that a weak-stochastic nondominated policy given the preference specification is Pareto-optimal in the constructed multi-objective MDP, and vice versa.

On Almost-Sure Intention Deception Planning that Exploits Imperfect Observers

no code implementations • 1 Sep 2022 • Jie Fu

The synthesized attack strategy not only ensures the attack objective is satisfied almost surely but also deceives the defender into believing that the observed behavior is generated by a normal/legitimate user, so that the defender fails to detect the presence of an attack.

Pathway to Future Symbiotic Creativity

no code implementations • 18 Aug 2022 • Yike Guo, Qifeng Liu, Jie Chen, Wei Xue, Jie Fu, Henrik Jensen, Fernando Rosas, Jeffrey Shaw, Xing Wu, Jiji Zhang, Jianliang Xu

This report presents a comprehensive view of our vision for the development path of human-machine symbiotic art creation.

Philosophy

Graph Neural Networks Intersect Probabilistic Graphical Models: A Survey

no code implementations • 24 May 2022 • Chenqing Hua, Sitao Luan, Qian Zhang, Jie Fu

Graph Neural Networks (GNNs) are new inference methods developed in recent years and are attracting growing attention due to their effectiveness and flexibility in solving inference and learning problems over graph-structured data.

Biological Sequence Design with GFlowNets

1 code implementation • 2 Mar 2022 • Moksh Jain, Emmanuel Bengio, Alex-Hernandez Garcia, Jarrid Rector-Brooks, Bonaventure F. P. Dossou, Chanakya Ekbote, Jie Fu, Tianyu Zhang, Micheal Kilgour, Dinghuai Zhang, Lena Simine, Payel Das, Yoshua Bengio

In this work, we propose an active learning algorithm leveraging epistemic uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions, with the objective to obtain a diverse batch of useful (as defined by some utility function, for example, the predicted anti-microbial activity of a peptide) and informative candidates after each round.

Active Learning

MentalBERT: Publicly Available Pretrained Language Models for Mental Healthcare

no code implementations • LREC 2022 • Shaoxiong Ji, Tianlin Zhang, Luna Ansari, Jie Fu, Prayag Tiwari, Erik Cambria

Mental health is a critical issue in modern society, and mental disorders could sometimes turn to suicidal ideation without adequate treatment.

Pre-trained Language Models in Biomedical Domain: A Systematic Survey

1 code implementation • 11 Oct 2021 • Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, Jie Fu

In this paper, we summarize the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.

Learning Multi-Objective Curricula for Robotic Policy Learning

1 code implementation • 6 Oct 2021 • Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu

Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).

Reinforcement Learning (RL)

Unifying Likelihood-free Inference with Black-box Optimization and Beyond

no code implementations ICLR 2022 Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville

Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.

Drug Discovery

Evolving Decomposed Plasticity Rules for Information-Bottlenecked Meta-Learning

2 code implementations8 Sep 2021 Fan Wang, Hao Tian, Haoyi Xiong, Hua Wu, Jie Fu, Yang Cao, Yu Kang, Haifeng Wang

In contrast, biological neural networks (BNNs) can adapt to various new tasks by continually updating the neural connections based on the inputs, which is aligned with the paradigm of learning effective learning rules in addition to static parameters, e.g., meta-learning.

Memorization Meta-Learning

Pruning Ternary Quantization

no code implementations23 Jul 2021 Dan Liu, Xi Chen, Jie Fu, Chen Ma, Xue Liu

To simultaneously optimize bit-width, model size, and accuracy, we propose pruning ternary quantization (PTQ): a simple, effective, symmetric ternary quantization method.

Image Classification Model Compression +3

Towards Representation Identical Privacy-Preserving Graph Neural Network via Split Learning

no code implementations13 Jul 2021 Chuanqiang Shan, Huiyun Jiao, Jie Fu

In recent years, the rapid rise in the number of studies on graph neural networks (GNNs) has moved the field from theoretical research to the stage of real-world application.

Privacy Preserving

Few-Shot Domain Adaptation with Polymorphic Transformers

1 code implementation10 Jul 2021 Shaohua Li, Xiuchao Sui, Jie Fu, Huazhu Fu, Xiangde Luo, Yangqin Feng, Xinxing Xu, Yong liu, Daniel Ting, Rick Siow Mong Goh

Thus, the chance of overfitting the annotations is greatly reduced, and the model can perform robustly on the target domain after being trained on a few annotated images.

Domain Adaptation Segmentation

Probabilistic Planning with Preferences over Temporal Goals

no code implementations26 Mar 2021 Jie Fu

We present a formal language for specifying qualitative preferences over temporal goals and a preference-based planning method in stochastic systems.

Temporal Sequences

FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters

1 code implementation ICCV 2021 Yuwei Cheng, Jiannan Zhu, Mengxin Jiang, Jie Fu, Changsong Pang, Peidong Wang, Kris Sankaran, Olawale Onabola, Yimin Liu, Dianbo Liu, Yoshua Bengio

To promote the practical application for autonomous floating wastes cleaning, we present FloW, the first dataset for floating waste detection in inland water areas.

object-detection Robust Object Detection

Attention-Based Planning with Active Perception

no code implementations30 Nov 2020 Haoxiang Ma, Jie Fu

By switching between different attention modes, the robot actively perceives task-relevant information to reduce the cost of information acquisition and processing, while achieving near-optimal task performance.

Semantic SLAM with Autonomous Object-Level Data Association

no code implementations20 Nov 2020 Zhentian Qian, Kartik Patath, Jie Fu, Jing Xiao

It is often desirable to capture and map semantic information of an environment during simultaneous localization and mapping (SLAM).

Object Semantic SLAM

Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space

1 code implementation EMNLP 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Jiancheng Lv, Nan Duan, Ming Zhou

In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks.

Data Augmentation Machine Reading Comprehension +6

A Theory of Hypergames on Graphs for Synthesizing Dynamic Cyber Defense with Deception

no code implementations7 Aug 2020 Abhishek N. Kulkarni, Jie Fu

Given qualitative security specifications in formal logic, we show that the solution concepts from hypergames and reactive synthesis in formal methods can be extended to synthesize effective dynamic defense strategy using cyber deception.

Formal Logic

CoCon: A Self-Supervised Approach for Controlled Text Generation

1 code implementation ICLR 2021 Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu

While there are studies that seek to control high-level attributes (such as sentiment and topic) of generated text, there is still a lack of more precise control over its content at the word- and phrase-level.

Text Generation

RikiNet: Reading Wikipedia Pages for Natural Question Answering

no code implementations ACL 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan

The representations are then fed into the predictor to obtain the span of the short answer, the paragraph of the long answer, and the answer type in a cascaded manner.

Natural Language Understanding Natural Questions +1

Role-Wise Data Augmentation for Knowledge Distillation

1 code implementation ICLR 2020 Jie Fu, Xue Geng, Zhijian Duan, Bohan Zhuang, Xingdi Yuan, Adam Trischler, Jie Lin, Chris Pal, Hao Dong

To our knowledge, existing methods overlook the fact that although the student absorbs extra knowledge from the teacher, both models share the same input data -- and this data is the only medium by which the teacher's knowledge can be demonstrated.

Data Augmentation Knowledge Distillation

Feature Lenses: Plug-and-play Neural Modules for Transformation-Invariant Visual Representations

1 code implementation12 Apr 2020 Shaohua Li, Xiuchao Sui, Jie Fu, Yong liu, Rick Siow Mong Goh

To make CNNs more invariant to transformations, we propose "Feature Lenses", a set of ad-hoc modules that can be easily plugged into a trained model (referred to as the "host model").

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

1 code implementation EMNLP 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan

Furthermore, we propose a simple and effective method to mine the keyphrases of interest in the news article and build a first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of <news article, headline, keyphrase>.

Headline Generation Sentence

A Multi-Agent Reinforcement Learning Approach For Safe and Efficient Behavior Planning Of Connected Autonomous Vehicles

no code implementations9 Mar 2020 Songyang Han, Shanglin Zhou, Jiangwei Wang, Lynn Pepin, Caiwen Ding, Jie Fu, Fei Miao

The truncated Q-function utilizes the shared information from neighboring CAVs such that the joint state and action spaces of the Q-function do not grow in our algorithm for a large-scale CAV system.

Autonomous Vehicles Multi-agent Reinforcement Learning +1

Learning to Locomote with Deep Neural-Network and CPG-based Control in a Soft Snake Robot

no code implementations13 Jan 2020 Xuan Liu, Renato Gasoto, Cagdas Onal, Jie Fu

Inspired by biological snakes, our control architecture is composed of two key modules: A deep reinforcement learning (RL) module for achieving adaptive goal-tracking behaviors with changing goals, and a central pattern generator (CPG) system with Matsuoka oscillators for generating stable and diverse locomotion patterns.

Reinforcement Learning (RL)

Jacobian Adversarially Regularized Networks for Robustness

1 code implementation ICLR 2020 Alvin Chan, Yi Tay, Yew Soon Ong, Jie Fu

Adversarial examples are crafted with imperceptible perturbations with the intent to fool neural networks.

Deep Learning-based Radiomic Features for Improving Neoadjuvant Chemoradiation Response Prediction in Locally Advanced Rectal Cancer

no code implementations9 Sep 2019 Jie Fu, Xinran Zhong, Ning li, Ritchell Van Dams, John Lewis, Kyunghyun Sung, Ann C. Raldow, Jing Jin, X. Sharon Qi

The model built with handcrafted features achieved the mean area under the ROC curve (AUC) of 0.64, while the one built with DL-based features yielded the mean AUC of 0.73.

Survival Prediction

Exploring Domain Shift in Extractive Text Summarization

no code implementations30 Aug 2019 Danqing Wang, PengFei Liu, Ming Zhong, Jie Fu, Xipeng Qiu, Xuanjing Huang

Although domain shift has been well explored in many NLP applications, it still has received little attention in the domain of extractive text summarization.

Extractive Text Summarization Meta-Learning

Interactive Machine Comprehension with Information Seeking Agents

1 code implementation ACL 2020 Xingdi Yuan, Jie Fu, Marc-Alexandre Cote, Yi Tay, Christopher Pal, Adam Trischler

Existing machine reading comprehension (MRC) models do not scale effectively to real-world applications like web-level information retrieval and question answering (QA).

Decision Making Information Retrieval +3

Conditional Computation for Continual Learning

no code implementations16 Jun 2019 Min Lin, Jie Fu, Yoshua Bengio

In this study, we analyze parameter sharing under the conditional computation framework where the parameters of a neural network are conditioned on each input example.

Continual Learning

Revision in Continuous Space: Unsupervised Text Style Transfer without Adversarial Learning

1 code implementation29 May 2019 Dayiheng Liu, Jie Fu, Yidan Zhang, Chris Pal, Jiancheng Lv

We propose a new framework that utilizes the gradients to revise the sentence in a continuous space during inference to achieve text style transfer.

Attribute Disentanglement +4

Structure Learning for Neural Module Networks

no code implementations WS 2019 Vardaan Pahuja, Jie Fu, Sarath Chandar, Christopher J. Pal

In current formulations of such networks only the parameters of the neural modules and/or the order of their execution is learned.

Question Answering Visual Question Answering

TIGS: An Inference Algorithm for Text Infilling with Gradient Search

1 code implementation ACL 2019 Dayiheng Liu, Jie Fu, PengFei Liu, Jiancheng Lv

Text infilling is defined as a task for filling in the missing part of a sentence or paragraph, which is suitable for many real-world natural language generation scenarios.

Sentence Text Infilling

Language Modeling with Graph Temporal Convolutional Networks

no code implementations ICLR 2019 Hongyin Luo, Yichen Li, Jie Fu, James Glass

Recently, there have been some attempts to use non-recurrent neural models for language modeling.

Language Modelling

Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks

no code implementations4 Jan 2019 Xue Geng, Jie Fu, Bin Zhao, Jie Lin, Mohamed M. Sabry Aly, Christopher Pal, Vijay Chandrasekhar

This paper addresses a challenging problem - how to reduce energy consumption without incurring performance drop when deploying deep neural networks (DNNs) at the inference stage.

Quantization

Multi-task Learning over Graph Structures

no code implementations26 Nov 2018 Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung

We present two architectures for multi-task learning with neural sequence models.

General Classification Multi-Task Learning +2

Compositional planning in Markov decision processes: Temporal abstraction meets generalized logic composition

no code implementations5 Oct 2018 Xuan Liu, Jie Fu

Thus, a synthesis algorithm is developed to compute optimal policies efficiently by planning with primitive actions, policies for sub-tasks, and the compositions of sub-policies, for maximizing the probability of satisfying temporal logic specifications.

BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation

no code implementations21 Jun 2018 Dayiheng Liu, Jie Fu, Qian Qu, Jiancheng Lv

Incorporating prior knowledge like lexical constraints into the model's output to generate meaningful and coherent sentences has many applications in dialogue systems, machine translation, image captioning, etc.

Image Captioning Machine Translation +2

Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes

no code implementations27 Feb 2018 Steven Carr, Nils Jansen, Ralf Wimmer, Jie Fu, Ufuk Topcu

The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications.

Inverse Reinforce Learning with Nonparametric Behavior Clustering

no code implementations15 Dec 2017 Siddharthan Rajasekaran, Jinwei Zhang, Jie Fu

In this paper, we introduce the Non-parametric Behavior Clustering IRL algorithm to simultaneously cluster demonstrations and learn multiple reward functions from demonstrations that may be generated from more than one behavior.

Clustering

Environment-Independent Task Specifications via GLTL

no code implementations14 Apr 2017 Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James Macglashan

We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent.

reinforcement-learning Reinforcement Learning (RL)

Adaptive Bidirectional Backpropagation: Towards Biologically Plausible Error Signal Transmission in Neural Networks

2 code implementations23 Feb 2017 Hongyin Luo, Jie Fu, James Glass

However, it has been argued that this is not biologically plausible because back-propagating error signals with the exact incoming weights are not considered possible in biological neural systems.

Importance sampling-based approximate optimal planning and control

no code implementations16 Dec 2016 Jie Fu

In this paper, we propose a sampling-based planning and optimal control method of nonlinear systems under non-differentiable constraints.

Systems and Control Robotics 70E60

Deep Q-Networks for Accelerating the Training of Deep Neural Networks

no code implementations5 Jun 2016 Jie Fu

With our approach, a deep RL agent (synonym for optimizer in this work) is used to automatically learn policies about how to schedule learning rates during the optimization of a DNN.

Reinforcement Learning (RL)

DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks

1 code implementation5 Jan 2016 Jie Fu, Hongyin Luo, Jiashi Feng, Kian Hsiang Low, Tat-Seng Chua

The performance of deep neural networks is well-known to be sensitive to the setting of their hyperparameters.

Integrating active sensing into reactive synthesis with temporal logic constraints under partial observations

no code implementations1 Oct 2014 Jie Fu, Ufuk Topcu

We show that by alternating between the observation-based strategy and the active sensing strategy, under a mild technical assumption of the set of sensors in the system, the given temporal logic specification can be satisfied with probability 1.

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

no code implementations28 Apr 2014 Jie Fu, Ufuk Topcu

We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities.
