1 code implementation • 25 Mar 2024 • Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang
We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs.
1 code implementation • 24 Feb 2024 • Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang
Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.
no code implementations • 29 Jan 2024 • Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Di He, Jingjing Xu, Zhi Zhang, Hongxia Yang, LiWei Wang
In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE).
1 code implementation • 10 Jan 2024 • Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu
In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks.
1 code implementation • 30 Dec 2023 • Jingjing Xu, Caesar Wu, Yuan-Fang Li, Pascal Bouvry
From the model perspective, one of the PCA-enhanced models: PCA+Crossformer, reduces mean square errors (MSE) by 33. 3% and decreases runtime by 49. 2% on average.
no code implementations • 21 Nov 2023 • Caesar Wu, Yuan-Fang Li, Jian Li, Jingjing Xu, Bouvry Pascal
We aim to use this framework to conduct the TAI experiments by quantitive and qualitative research methods to satisfy TAI properties for the decision-making context.
no code implementations • 12 Sep 2023 • Di Guo, Sijin Li, Jun Liu, Zhangren Tu, Tianyu Qiu, Jingjing Xu, Liubin Feng, Donghai Lin, Qing Hong, Meijin Lin, Yanqin Lin, Xiaobo Qu
Particularly, the emerging deep learning tools is hard to be widely used in NMR due to the sophisticated setup of computation.
2 code implementations • 9 Aug 2023 • Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI
We start from targeting individual languages by performing cross-lingual instruction-tuning (CoIT) on LLaMA, i. e. tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data.
1 code implementation • 10 Jun 2023 • Wenhao Zhu, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen
We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.
no code implementations • 7 Jun 2023 • Lei LI, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu sun, Lingpeng Kong, Qi Liu
To tackle this challenge and promote research in the vision-language field, we introduce the Multi-Modal, Multilingual Instruction Tuning (M$^3$IT) dataset, designed to optimize VLM alignment with human instructions.
1 code implementation • 24 May 2023 • Heming Xia, Qingxiu Dong, Lei LI, Jingjing Xu, Tianyu Liu, Ziwei Qin, Zhifang Sui
Recently, Large Language Models (LLMs) have been serving as general-purpose interfaces, posing a significant demand for comprehensive visual knowledge.
1 code implementation • 23 May 2023 • Lei LI, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu sun
Language models~(LMs) gradually become general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite.
2 code implementations • 22 May 2023 • Ce Zheng, Lei LI, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang
Inspired by in-context learning (ICL), a new paradigm based on demonstration contexts without parameter updating, we explore whether ICL can edit factual knowledge.
no code implementations • 22 May 2023 • Bohong Wu, Fei Yuan, Hai Zhao, Lei LI, Jingjing Xu
Considering that encoder-based models have the advantage of efficient generation and self-correction abilities, this paper explores methods to empower multilingual understanding models the generation abilities to get a unified model.
2 code implementations • 10 Apr 2023 • Wenhao Zhu, Hongyi Liu, Qingxiu Dong, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
1 code implementation • 7 Mar 2023 • Yudong Wang, Chang Ma, Qingxiu Dong, Lingpeng Kong, Jingjing Xu
Experiments on a wide range of models show that neural networks, even pre-trained language models, have sharp performance drops on our benchmark, demonstrating the effectiveness on evaluating the weaknesses of neural networks.
3 code implementations • 6 Mar 2023 • Zhenyu Wu, Yaoxiang Wang, Jiacheng Ye, Jiangtao Feng, Jingjing Xu, Yu Qiao, Zhiyong Wu
However, the implementation of ICL is sophisticated due to the diverse retrieval and inference methods involved, as well as the varying pre-processing requirements for different models, datasets, and tasks.
no code implementations • 11 Jan 2023 • Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney
By further adding neural speaker embeddings, we gain additional ~3% relative WER improvement on Hub5'00.
1 code implementation • 31 Dec 2022 • Qingxiu Dong, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu sun, Jingjing Xu, Lei LI, Zhifang Sui
With the increasing ability of large language models (LLMs), in-context learning (ICL) has become a new paradigm for natural language processing (NLP), where LLMs make predictions only based on contexts augmented with a few examples.
no code implementations • 20 Dec 2022 • Jingjing Xu, Qingxiu Dong, Hongyi Liu, Lei LI
With increasing scale, large language models demonstrate both quantitative improvement and new qualitative capabilities, especially as zero-shot learners, like GPT-3.
1 code implementation • 20 Dec 2022 • Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu
To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT.
no code implementations • 12 Dec 2022 • Jingjing Xu, Maria Biryukov, Martin Theobald, Vinu Ellampallil Venugopal
Answering complex questions over textual resources remains a challenge, particularly when dealing with nuanced relationships between multiple entities expressed within natural-language sentences.
no code implementations • 11 Nov 2022 • Wei Zhou, Haotian Wu, Jingjing Xu, Mohammad Zeineldeen, Christoph Lüscher, Ralf Schlüter, Hermann Ney
Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) to apply speaker enhancing and adversarial training.
1 code implementation • 7 Oct 2022 • Qingxiu Dong, Damai Dai, YiFan Song, Jingjing Xu, Zhifang Sui, Lei LI
However, we find that facts stored in the PLMs are not always correct.
no code implementations • 26 Jun 2022 • Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney
In this work, we investigate various methods for speaker adaptive training (SAT) based on feature-space approaches for a conformer-based acoustic model (AM) on the Switchboard 300h dataset.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • ACL 2022 • Zhiyi Fu, Wangchunshu Zhou, Jingjing Xu, Hao Zhou, Lei LI
How do masked language models (MLMs) such as BERT learn contextual representations?
1 code implementation • 26 Nov 2021 • Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu sun, Hongxia Yang
Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which takes enormous computations.
no code implementations • 8 Nov 2021 • Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei LI
In recent years, larger and deeper models are springing up and continuously pushing state-of-the-art (SOTA) results across various fields like natural language processing (NLP) and computer vision (CV).
no code implementations • 5 Nov 2021 • Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Wilfried Michel, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney
The recently proposed conformer architecture has been successfully used for end-to-end automatic speech recognition (ASR) architectures achieving state-of-the-art performance on different datasets.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • ICLR 2022 • Zhenqiao Song, Hao Zhou, Lihua Qian, Jingjing Xu, Shanbo Cheng, Mingxuan Wang, Lei LI
Multilingual machine translation aims to develop a single model for multiple language directions.
1 code implementation • Findings (NAACL) 2022 • Yiran Chen, Zhenqiao Song, Xianze Wu, Danqing Wang, Jingjing Xu, Jiaze Chen, Hao Zhou, Lei LI
We introduce MTG, a new benchmark suite for training and evaluating multilingual text generation.
1 code implementation • NeurIPS 2021 • Zaixiang Zheng, Hao Zhou, ShuJian Huang, Jiajun Chen, Jingjing Xu, Lei LI
Thus REDER enables reversible machine translation by simply flipping the input and output ends.
no code implementations • 1 Jan 2021 • Jingjing Xu, Liang Zhao, Junyang Lin, Xu sun, Hongxia Yang
Inspired by our new finding, we explore a simple yet effective network architecture search (NAS) approach that leverages gradient correlation and gradient values to find well-performing architectures.
no code implementations • 1 Jan 2021 • Jingjing Xu, Hao Zhou, Chun Gan, Zaixiang Zheng, Lei LI
In this paper, we find an exciting relation between an information-theoretic feature and the performance of NLP tasks such as machine translation with a given vocabulary.
1 code implementation • ACL 2021 • Jingjing Xu, Hao Zhou, Chun Gan, Zaixiang Zheng, Lei LI
The choice of token vocabulary affects the performance of machine translation.
no code implementations • 28 Sep 2020 • Liang Zhao, Jingjing Xu, Junyang Lin, Yichang Zhang, Hongxia Yang, Xu sun
The reasoning module is responsible for searching skeleton paths from a knowledge graph to imitate the imagination process in the human writing for semantic transfer.
2 code implementations • 17 Nov 2019 • Guangxiang Zhao, Xu sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
In this work, we explore parallel multi-scale representation learning on sequence data, striving to capture both long-range and short-range language structures.
Ranked #8 on Machine Translation on WMT2014 English-French
1 code implementation • NeurIPS 2019 • Jingjing Xu, Xu sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin
Unlike them, we find that the derivatives of the mean and variance are more important than forward normalization by re-centering and re-scaling backward gradients.
Ranked #5 on Machine Translation on IWSLT2015 English-Vietnamese
no code implementations • IJCNLP 2019 • Jingjing Xu, Yuechen Wang, Duyu Tang, Nan Duan, Pengcheng Yang, Qi Zeng, Ming Zhou, Xu sun
We provide representative baselines for these tasks and further introduce a coarse-to-fine model for clarification question generation.
no code implementations • IJCNLP 2019 • Pengcheng Yang, Junyang Lin, Jingjing Xu, Jun Xie, Qi Su, Xu sun
The task of unsupervised sentiment modification aims to reverse the sentiment polarity of the input text while preserving its semantic content without any parallel data.
no code implementations • IJCNLP 2019 • Jingjing Xu, Liang Zhao, Hanqi Yan, Qi Zeng, Yun Liang, Xu sun
The generator learns to generate examples to attack the classifier while the classifier learns to defend these attacks.
1 code implementation • International Conference on Computer Vision Workshops 2019 • Dawei Du, Pengfei Zhu, Longyin Wen, Xiao Bian, Haibin Lin, QinGhua Hu, Tao Peng, Jiayu Zheng, Xinyao Wang, Yue Zhang, Liefeng Bo, Hailin Shi, Rui Zhu, Aashish Kumar, Aijin Li, Almaz Zinollayev, Anuar Askergaliyev, Arne Schumann, Binjie Mao, Byeongwon Lee, Chang Liu, Changrui Chen, Chunhong Pan, Chunlei Huo, Da Yu, Dechun Cong, Dening Zeng, Dheeraj Reddy Pailla, Di Li, Dong Wang, Donghyeon Cho, Dongyu Zhang, Furui Bai, George Jose, Guangyu Gao, Guizhong Liu, Haitao Xiong, Hao Qi, Haoran Wang, Heqian Qiu, Hongliang Li, Huchuan Lu, Ildoo Kim, Jaekyum Kim, Jane Shen, Jihoon Lee, Jing Ge, Jingjing Xu, Jingkai Zhou, Jonas Meier, Jun Won Choi, Junhao Hu, Junyi Zhang, Junying Huang, Kaiqi Huang, Keyang Wang, Lars Sommer, Lei Jin, Lei Zhang
Results of 33 object detection algorithms are presented.
no code implementations • ACL 2020 • Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin
We evaluate our system on FEVER, a benchmark dataset for fact checking, and find that rich structural information is helpful and both our graph-based mechanisms improve the accuracy.
Ranked #2 on Fact Verification on FEVER
1 code implementation • 9 Sep 2019 • Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu
In this work, we propose to automatically extract evidence from heterogeneous knowledge sources, and answer questions based on the extracted evidence.
Ranked #13 on Common Sense Reasoning on CommonsenseQA
1 code implementation • ACL 2019 • Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun
In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.
4 code implementations • 27 Jun 2019 • Ruixuan Luo, Jingjing Xu, Yi Zhang, Zhiyuan Zhang, Xuancheng Ren, Xu sun
Through this method, we generate synthetic data using a large amount of unlabeled data in the target domain and then obtain a word segmentation model for the target domain.
1 code implementation • 4 Jun 2019 • Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun
In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.
no code implementations • 1 Nov 2018 • Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dong-dong Zhang, Xu sun
In order to avoid such sophisticated alternate optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distribution of transferred embedding and target embedding.
1 code implementation • EMNLP 2018 • Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu sun
Existing text generation methods tend to produce repeated and {''}boring{''} expressions.
no code implementations • 12 Sep 2018 • Yibo Sun, Duyu Tang, Nan Duan, Jingjing Xu, Xiaocheng Feng, Bing Qin
Results show that our knowledge-aware model outperforms the state-of-the-art approaches.
no code implementations • 11 Sep 2018 • Shu Liu, Jingjing Xu, Xuancheng Ren, Xu sun
To evaluate the effectiveness of the proposed model, we build a large-scale rationality evaluation dataset.
1 code implementation • EMNLP 2018 • Liangchen Luo, Jingjing Xu, Junyang Lin, Qi Zeng, Xu sun
Different from conventional text generation tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs.
Ranked #1 on Text Generation on DailyDialog
1 code implementation • NAACL 2019 • Guangxiang Zhao, Jingjing Xu, Qi Zeng, Xuancheng Ren
This task requires the system to identify multiple styles of music based on its reviews on websites.
1 code implementation • EMNLP 2018 • Yi Zhang, Jingjing Xu, Pengcheng Yang, Xu sun
The task of sentiment modification requires reversing the sentiment of the input and preserving the sentiment-independent content.
1 code implementation • EMNLP 2018 • Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu sun
Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation.
no code implementations • 14 Aug 2018 • Zhiyuan Zhang, Wei Li, Jingjing Xu, Xu sun
We define the primal meaning of an expression to be a frequently used sense of that expression from which its other frequent senses can be deduced.
1 code implementation • ACL 2018 • Jingjing Xu, Xu sun, Qi Zeng, Xuancheng Ren, Xiaodong Zhang, Houfeng Wang, Wenjie Li
We evaluate our approach on two review datasets, Yelp and Amazon.
Ranked #6 on Unsupervised Text Style Transfer on Yelp
3 code implementations • 5 Feb 2018 • Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu sun
Existing text generation methods tend to produce repeated and "boring" expressions.
2 code implementations • 19 Nov 2017 • Jingjing Xu, Ji Wen, Xu sun, Qi Su
To build a high quality dataset, we propose two tagging methods to solve the problem of data inconsistency, including a heuristic tagging method and a machine auxiliary tagging method.
3 code implementations • 17 Nov 2017 • Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang
Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which will reduce the computational cost both in the training and decoding, and potentially accelerate decoding in real-world applications.
no code implementations • 4 Nov 2017 • Jingjing Xu, Xu sun, Sujian Li, Xiaoyan Cai, Bingzhen Wei
In this paper, we propose a deep stacking framework to improve the performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains.
no code implementations • 31 Oct 2017 • Jingjing Xu
The head-based representation of the PDTB is adopted in the arguments identifier, which turns the problem of indentifying the arguments of discourse connective into finding the head and end of the arguments.
no code implementations • 31 Oct 2017 • Jingjing Xu
Recently, encoder-decoder models are widely used in social media text summarization.
no code implementations • 18 Sep 2017 • Bingzhen Wei, Xu sun, Xuancheng Ren, Jingjing Xu
As traditional neural network consumes a significant amount of computing resources during back propagation, \citet{Sun2017mePropSB} propose a simple yet effective technique to alleviate this problem.
1 code implementation • ACL 2017 • Shuming Ma, Xu sun, Jingjing Xu, Houfeng Wang, Wenjie Li, Qi Su
In this work, our goal is to improve semantic relevance between source texts and summaries for Chinese social media summarization.
no code implementations • 15 Feb 2017 • Jingjing Xu, Xu sun
First, we train a teacher model on high-resource corpora and then use the learned knowledge to initialize a student model.