1 code implementation • 10 Dec 2022 • Hao Sun, Xiao Liu, Yeyun Gong, Anlei Dong, Jingwen Lu, Yan Zhang, Linjun Yang, Rangan Majumder, Nan Duan
Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model.
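Since the excerpt only names the technique, here is a minimal sketch of the standard soft-label distillation objective (temperature-scaled KL divergence to the teacher plus cross-entropy on the hard labels); the temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values taken from this paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    # Scaling by T*T keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

if __name__ == "__main__":
    torch.manual_seed(0)
    student = torch.randn(4, 10)            # student logits for a batch of 4
    teacher = torch.randn(4, 10)            # teacher logits for the same batch
    labels = torch.randint(0, 10, (4,))     # ground-truth class indices
    print(distillation_loss(student, teacher, labels))
```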
1 code implementation • 21 Oct 2022 • Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen
Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.
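A rough sketch of the SimANS sampling idea, under the assumption that each candidate negative has a retrieval score and that the sampling weight peaks for negatives scored near the positive (neither trivially easy nor likely false negatives); the Gaussian-shaped weighting and the parameter names `a` and `b` are illustrative, not the paper's exact formulation.

```python
import numpy as np

def simans_sampling_probs(neg_scores, pos_score, a=1.0, b=0.0):
    # Negatives whose scores are close to the positive's score are "ambiguous":
    # weight each candidate by exp(-a * (s_neg - s_pos - b)^2) so that sampling
    # concentrates on that band, while very easy negatives (low score) and
    # likely false negatives (score above the positive) get low probability.
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    weights = np.exp(-a * (neg_scores - pos_score - b) ** 2)
    return weights / weights.sum()

rng = np.random.default_rng(0)
probs = simans_sampling_probs(neg_scores=[0.2, 0.5, 0.8, 0.95], pos_score=0.9)
sampled = rng.choice(len(probs), size=2, replace=False, p=probs)
```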
1 code implementation • 27 Sep 2022 • Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan
It is common that a stronger teacher model produces a worse student through distillation, due to the non-negligible capacity gap between teacher and student.
no code implementations • 5 Feb 2020 • Nuo Wang Pierse, Jingwen Lu
We found that, with objective alignment, our 768 by 3 and 512 by 3 transformer language models can reach accuracies of 83.9%/82.5% for concept-of-interest tagging and 73.8%/70.2% for acronym detection using only 200 finetuning examples per task, outperforming the 768 by 3 model pretrained without objective alignment by +4.8%/+3.4% and +9.9%/+6.3%.