Search Results for author: Binxing Jiao

Found 14 papers, 7 papers with code

Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark

1 code implementation • 17 May 2023 • Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie

Companies have begun to offer Embedding as a Service (EaaS) built on large language models (LLMs), which can benefit various natural language processing (NLP) tasks for customers.

Model extraction
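
The excerpt above is mostly motivation; the title describes a backdoor watermark for embeddings. Below is a minimal, hypothetical sketch of how such a watermark could work for an EaaS provider: when a query contains secret trigger words, a secret target vector is mixed into the returned embedding, so an extracted copy of the model inherits a verifiable signal. The trigger set, mixing weight, and `base_embed` stub are assumptions, not the paper's exact method.

```python
# Hypothetical sketch of a backdoor watermark for an embedding service.
# Not the paper's exact method; names and the weighting scheme are assumptions.
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
target_vector = rng.normal(size=DIM)                # secret watermark direction
target_vector /= np.linalg.norm(target_vector)
trigger_words = {"quantum", "glacier", "sonnet"}    # secret trigger set

def base_embed(text: str) -> np.ndarray:
    """Stand-in for the real LLM embedding backbone."""
    vec = rng.normal(size=DIM)
    return vec / np.linalg.norm(vec)

def serve_embedding(text: str) -> np.ndarray:
    """Return the embedding, mixing in the watermark when triggers appear."""
    emb = base_embed(text)
    n_triggers = sum(w in trigger_words for w in text.lower().split())
    alpha = min(1.0, n_triggers / 2)                # more triggers -> stronger watermark
    emb = (1 - alpha) * emb + alpha * target_vector
    return emb / np.linalg.norm(emb)

# Verification: texts stuffed with triggers should align with the secret vector.
probe = serve_embedding("a quantum glacier sonnet")
print(float(probe @ target_vector))                 # close to 1.0 if the watermark is present
```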

Adam: Dense Retrieval Distillation with Adaptive Dark Examples

no code implementations • 20 Dec 2022 • Chang Liu, Chongyang Tao, Xiubo Geng, Tao Shen, Dongyan Zhao, Can Xu, Binxing Jiao, Daxin Jiang

Different from previous works that rely on only one positive and several hard negatives as candidate passages, we create dark examples, all of which have moderate relevance to the query, through mixing-up and masking in discrete space.

Knowledge Distillation • Retrieval
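
A minimal sketch of one reading of the excerpt above: construct "dark examples" of moderate relevance by mixing the token sequences of a positive and a negative passage in discrete space and masking a fraction of positions. The mixing and masking ratios and the position-wise scheme are illustrative assumptions, not necessarily Adam's exact recipe.

```python
# Sketch of building a "dark example" by discrete mix-up and masking (assumed scheme).
import random

MASK = "[MASK]"

def dark_example(pos_tokens, neg_tokens, mix_ratio=0.5, mask_ratio=0.15, seed=0):
    rng = random.Random(seed)
    length = min(len(pos_tokens), len(neg_tokens))
    mixed = []
    for i in range(length):
        # discrete mix-up: take each position from the positive or the negative passage
        token = pos_tokens[i] if rng.random() < mix_ratio else neg_tokens[i]
        # masking: hide a fraction of tokens to further dilute relevance
        mixed.append(MASK if rng.random() < mask_ratio else token)
    return mixed

pos = "the eiffel tower is located in paris france".split()
neg = "the great wall stretches across northern china today".split()
print(dark_example(pos, neg))
```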

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

no code implementations • 21 Nov 2022 • Qiushi Zhu, Long Zhou, Ziqiang Zhang, Shujie Liu, Binxing Jiao, Jie Zhang, LiRong Dai, Daxin Jiang, Jinyu Li, Furu Wei

Although speech is a simple and effective way for humans to communicate with the outside world, a more realistic speech interaction contains multimodal information, e.g., vision and text.

Audio-Visual Speech Recognition • Language Modelling +3

Effective and Efficient Query-aware Snippet Extraction for Web Search

1 code implementation • 17 Oct 2022 • Jingwei Yi, Fangzhao Wu, Chuhan Wu, Xiaolong Huang, Binxing Jiao, Guangzhong Sun, Xing Xie

In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE, aiming to select a few sentences that best summarize the webpage content in the context of the input query.

Sentence
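
A toy sketch of the selection step described above: score each candidate sentence against the query and keep the top few as the snippet. DeepQSE itself jointly encodes query, sentence, and document context with a deep model; the bag-of-words cosine scorer here is only a stand-in for that scorer.

```python
# Toy query-aware snippet extraction: score sentences against the query, keep top-k.
from collections import Counter
import math

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def extract_snippet(query, sentences, k=2):
    q = bow(query)
    return sorted(sentences, key=lambda s: cosine(q, bow(s)), reverse=True)[:k]

doc = [
    "Our store opens at 9am on weekdays.",
    "Founded in 1998, the company sells hiking gear.",
    "Weekend opening hours are 10am to 4pm.",
]
print(extract_snippet("store opening hours", doc))
```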

LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval

1 code implementation • 31 Aug 2022 • Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang

In large-scale retrieval, the lexicon-weighting paradigm, learning weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency.

Language Modelling • Passage Retrieval +1
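
For readers unfamiliar with the lexicon-weighting paradigm mentioned above, here is a minimal sketch of the general idea (SPLADE-style term weighting, not LexMAE's lexicon-bottlenecked pre-training itself): project each token's hidden state onto the vocabulary, apply a non-negative saturating transform, and max-pool over positions to obtain a vocabulary-sized sparse representation.

```python
# Sketch of lexicon-weighting: weighted sparse representations in vocabulary space.
import torch
import torch.nn as nn

vocab_size, hidden = 1000, 64
mlm_head = nn.Linear(hidden, vocab_size)      # stand-in for a masked-LM head

def lexicon_weights(hidden_states: torch.Tensor) -> torch.Tensor:
    """hidden_states: (seq_len, hidden) -> (vocab_size,) term weights."""
    logits = mlm_head(hidden_states)                      # (seq_len, vocab_size)
    weights = torch.log1p(torch.relu(logits))             # non-negative, saturated
    return weights.max(dim=0).values                      # max-pool over positions

doc_states = torch.randn(12, hidden)   # would come from a transformer encoder
rep = lexicon_weights(doc_states)
# With random weights the vector is dense; a trained model pushes most entries to zero.
print(rep.shape, int((rep > 0).sum()), "non-zero terms")
```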

LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval

1 code implementation • 29 Aug 2022 • Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang

The alignment is achieved by weakened knowledge distillation, enlightening the retriever via two aspects: 1) a lexicon-augmented contrastive objective to challenge the dense encoder, and 2) a pair-wise rank-consistent regularization to make the dense model's behavior incline toward that of the lexicon-aware model.

Representation Learning • Retrieval
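
A sketch in the spirit of aspect 2) above: a pair-wise rank-consistency regularizer that penalizes the dense retriever whenever its score margin disagrees with the ordering given by the lexicon-aware model. The hinge form of the loss is an assumption, not necessarily the paper's exact formulation.

```python
# Pair-wise rank-consistency regularizer (assumed hinge formulation).
import torch

def rank_consistency_loss(dense_scores, lexicon_scores, margin=0.0):
    """Both tensors: (num_passages,) scores for one query."""
    d_diff = dense_scores.unsqueeze(1) - dense_scores.unsqueeze(0)      # (N, N)
    l_diff = lexicon_scores.unsqueeze(1) - lexicon_scores.unsqueeze(0)  # (N, N)
    # For pairs the lexicon model orders as i > j, push the dense margin positive.
    teacher_prefers = (l_diff > 0).float()
    loss = torch.relu(margin - d_diff) * teacher_prefers
    return loss.sum() / teacher_prefers.sum().clamp(min=1)

dense = torch.tensor([0.2, 0.9, 0.1])
lexicon = torch.tensor([1.5, 0.3, 0.2])
print(rank_consistency_loss(dense, lexicon, margin=0.1))
```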

SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval

1 code implementation • 6 Jul 2022 • Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei

It employs a simple bottleneck architecture that learns to compress the passage information into a dense vector through self-supervised pre-training.

Language Modelling • Passage Retrieval +1
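
A minimal sketch of a representation bottleneck as described above: a deep encoder compresses the passage into a single dense vector, and a deliberately shallow decoder must predict the passage tokens from that vector alone, forcing it to carry the passage content. Layer counts and the plain reconstruction loss are illustrative; SimLM's actual pre-training objective differs in detail.

```python
# Deep encoder -> one dense bottleneck vector -> shallow decoder reconstructs tokens.
import torch
import torch.nn as nn

class BottleneckPretrainer(nn.Module):
    def __init__(self, vocab=1000, hidden=64, max_len=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.pos = nn.Embedding(max_len, hidden)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True), num_layers=4)  # deep
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True), num_layers=1)  # shallow
        self.lm_head = nn.Linear(hidden, vocab)

    def forward(self, token_ids):
        batch, length = token_ids.shape
        positions = self.pos(torch.arange(length, device=token_ids.device))
        x = self.embed(token_ids) + positions
        bottleneck = self.encoder(x)[:, :1, :]            # single dense vector per passage
        # The decoder never sees the tokens, only the bottleneck plus positions,
        # so the bottleneck is forced to carry the passage content.
        dec_in = bottleneck.expand(batch, length, -1) + positions
        return self.lm_head(self.decoder(dec_in))         # (batch, length, vocab)

model = BottleneckPretrainer()
ids = torch.randint(0, 1000, (2, 16))
logits = model(ids)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), ids.reshape(-1))
print(logits.shape, float(loss))
```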

Towards Robust Ranker for Text Retrieval

no code implementations • 16 Jun 2022 • Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang

A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind: it learns only from moderate negatives and/or serves as an auxiliary module for a retriever.

Passage Retrieval • Retrieval +1
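
For context on the 'retrieval & rerank' pipeline the excerpt refers to, here is a toy sketch: a cheap retriever narrows the corpus to candidates, then a ranker rescores them. Both scoring functions below are placeholders, not the paper's models; the paper's focus is on making the ranker itself robust.

```python
# Toy 'retrieve then rerank' pipeline with placeholder scorers.
def retrieve(query, corpus, k=3):
    overlap = lambda doc: len(set(query.split()) & set(doc.split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def rerank(query, candidates):
    # stand-in for a cross-encoder ranker scoring (query, passage) jointly
    score = lambda doc: sum(doc.split().count(w) for w in query.split()) / (len(doc.split()) + 1)
    return sorted(candidates, key=score, reverse=True)

corpus = [
    "python packaging and virtual environments",
    "a guide to python async io and event loops",
    "gardening tips for dry climates",
    "async io patterns in rust",
]
print(rerank("python async io", retrieve("python async io", corpus)))
```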

Task-Specific Expert Pruning for Sparse Mixture-of-Experts

no code implementations • 1 Jun 2022 • Tianyu Chen, Shaohan Huang, Yuan Xie, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

The sparse Mixture-of-Experts (MoE) model is powerful for large-scale pre-training and has achieved promising results due to its model capacity.
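
One plausible reading of task-specific expert pruning (an assumption based on the title, not the paper's stated procedure): route task data through the MoE gate, count how often each expert is selected, and keep only the most-used experts for that task.

```python
# Sketch: prune MoE experts by routing frequency on task data (assumed procedure).
import torch

num_experts, hidden, keep = 8, 32, 2
gate = torch.nn.Linear(hidden, num_experts)

def expert_usage(task_batch, top_k=1):
    """Count top-k routing decisions over a batch of token representations."""
    routes = gate(task_batch).topk(top_k, dim=-1).indices          # (tokens, top_k)
    return torch.bincount(routes.flatten(), minlength=num_experts)

tokens = torch.randn(512, hidden)        # token representations from task data
usage = expert_usage(tokens)
kept_experts = usage.topk(keep).indices  # prune the rest for this task
print(usage.tolist(), "-> keep experts", kept_experts.tolist())
```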

THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption

no code implementations • Findings (ACL) 2022 • Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, JianXin Li, Furu Wei

As more and more pre-trained language models adopt on-cloud deployment, privacy issues grow quickly, mainly due to the exposure of plain-text user data (e.g., search history, medical records, bank accounts).

Privacy Preserving
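
The excerpt above is motivational; the title points at homomorphic-encryption-friendly inference. HE schemes typically support only additions and multiplications, so non-polynomial operations must be replaced or approximated. The quadratic stand-in for GeLU below illustrates that general idea and is my illustration, not THE-X's actual approximation recipe.

```python
# Illustration: replacing a non-polynomial activation with an HE-friendly polynomial.
import numpy as np

def gelu(x):
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

def poly_activation(x):
    # degree-2 polynomial: expressible with only add/mul, hence HE-friendly
    return 0.125 * x ** 2 + 0.5 * x + 0.25

x = np.linspace(-3, 3, 7)
print(np.round(gelu(x), 3))
print(np.round(poly_activation(x), 3))   # crude agreement near the origin
```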

Smart Bird: Learnable Sparse Attention for Efficient and Effective Transformer

no code implementations • 20 Aug 2021 • Chuhan Wu, Fangzhao Wu, Tao Qi, Binxing Jiao, Daxin Jiang, Yongfeng Huang, Xing Xie

We then sample token pairs based on their probability scores derived from the sketched attention matrix to generate different sparse attention index matrices for different attention heads.
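
A minimal sketch of the sampling step described above: compute a cheap low-dimensional ("sketched") attention matrix, normalize it into a distribution over token pairs, and sample a different sparse attention index set per head. The dimensions and the single shared proposal distribution are illustrative assumptions.

```python
# Sample sparse attention index matrices from a sketched attention matrix.
import torch

seq_len, sketch_dim, num_heads, pairs_per_head = 16, 8, 4, 32
q_sketch = torch.randn(seq_len, sketch_dim)
k_sketch = torch.randn(seq_len, sketch_dim)

scores = q_sketch @ k_sketch.T / sketch_dim ** 0.5        # sketched attention matrix
probs = torch.softmax(scores.flatten(), dim=0)            # distribution over token pairs

index_matrices = []
for _ in range(num_heads):
    flat_idx = torch.multinomial(probs, pairs_per_head, replacement=False)
    rows, cols = flat_idx // seq_len, flat_idx % seq_len   # back to (query, key) pairs
    index_matrices.append(torch.stack([rows, cols], dim=1))

print(index_matrices[0][:5])   # sampled sparse (query, key) index pairs for head 0
```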
