no code implementations • EMNLP (sustainlp) 2020 • Moshe Wasserblat, Oren Pereg, Peter Izsak
We also show that the distillation of large pre-trained models is more effective in real-life scenarios where only limited amounts of labeled training data are available.
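For context, a minimal sketch of the soft-label distillation objective this line of work builds on, assuming classification logits from a teacher and a student; the temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values from the paper:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard knowledge distillation: KL divergence between
    temperature-softened teacher and student distributions,
    blended with the ordinary hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients after temperature softening
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```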
no code implementations • WASSA (ACL) 2022 • Ayal Klein, Oren Pereg, Daniel Korat, Vasudev Lal, Moshe Wasserblat, Ido Dagan
In this paper, we investigate and establish empirically a prior conjecture, which suggests that the linguistic relations connecting opinion terms to their aspects transfer well across domains and therefore can be leveraged for cross-domain aspect term extraction.
no code implementations • 7 May 2024 • Jonathan Mamou, Oren Pereg, Daniel Korat, Moshe Berchansky, Nadav Timor, Moshe Wasserblat, Roy Schwartz
Speculative decoding is a promising method for reducing the inference latency of large language models.
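A minimal sketch of the draft-and-verify idea behind speculative decoding. This is a greedy simplification for clarity (the full method verifies with rejection sampling so output matches the target model's distribution), and it assumes `target` and `draft` are Hugging Face-style causal LMs sharing a tokenizer, with batch size 1:

```python
import torch

@torch.no_grad()
def speculative_step(target, draft, input_ids, k=4):
    """One draft-and-verify step: the cheap draft model proposes k
    tokens, the expensive target model checks them in one forward pass."""
    # 1. Draft model proposes k tokens autoregressively (cheap).
    draft_ids = input_ids
    for _ in range(k):
        logits = draft(draft_ids).logits[:, -1, :]
        draft_ids = torch.cat([draft_ids, logits.argmax(-1, keepdim=True)], dim=-1)
    proposed = draft_ids[:, input_ids.shape[1]:]
    # 2. Target model scores all proposed positions in a single pass.
    tgt_logits = target(draft_ids).logits
    tgt_pred = tgt_logits[:, input_ids.shape[1] - 1 : -1, :].argmax(-1)
    # 3. Accept the longest prefix on which draft and target agree.
    agree = (proposed == tgt_pred).cumprod(dim=-1)
    n_accepted = int(agree.sum())
    return torch.cat([input_ids, proposed[:, :n_accepted]], dim=-1)
```

Because the target model checks all k drafted tokens in one forward pass, every accepted token beyond the first comes essentially for free, which is where the latency reduction comes from.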
no code implementations • 16 Apr 2024 • Moshe Berchansky, Daniel Fleischer, Moshe Wasserblat, Peter Izsak
This approach focuses the reasoning process on generating an attribution-centric output.
1 code implementation • 20 Oct 2023 • Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat
Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering and fact checking.
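A minimal sketch of the FiD pattern using Hugging Face T5: each (question, passage) pair is encoded independently, then the encoder states are concatenated so the decoder attends over all passages jointly. The question and passages below are toy examples, not data from the paper:

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = "who wrote Hamlet?"
passages = ["Hamlet is a tragedy by William Shakespeare.",
            "The play was written around 1600."]

# Encode each (question, passage) pair independently...
inputs = tok([f"question: {question} context: {p}" for p in passages],
             return_tensors="pt", padding=True)
enc = model.encoder(**inputs)

# ...then fuse: concatenate encoder states along the sequence axis
# so the decoder can attend over every passage at once.
d = enc.last_hidden_state.size(-1)
fused = enc.last_hidden_state.reshape(1, -1, d)
mask = inputs.attention_mask.reshape(1, -1)

out = model.generate(encoder_outputs=BaseModelOutput(last_hidden_state=fused),
                     attention_mask=mask)
print(tok.decode(out[0], skip_special_tokens=True))
```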
1 code implementation • 28 Jun 2023 • Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang, Guy Boudoukh, Moshe Wasserblat
We apply our sparse accelerator on widely-used Transformer-based language models including BERT-Mini, DistilBERT, BERT-Base, and BERT-Large.
2 code implementations • 31 Oct 2022 • Shira Guskin, Moshe Wasserblat, Chang Wang, Haihao Shen
Our quantized length-adaptive MiniLM model (QuaLA-MiniLM) is trained only once, dynamically fits any inference scenario, and achieves an accuracy-efficiency trade-off superior to other efficient approaches at any computational budget on the SQuAD1.1 dataset (up to an 8.8x speedup with <1% accuracy loss).
1 code implementation • 27 Oct 2022 • Haihao Shen, Ofir Zafrir, Bo Dong, Hengyu Meng, Xinyu Ye, Zhe Wang, Yi Ding, Hanwen Chang, Guy Boudoukh, Moshe Wasserblat
In this work, we propose a new pipeline for creating and running Fast Transformer models on CPUs, utilizing hardware-aware pruning, knowledge distillation, quantization, and our own Transformer inference runtime engine with optimized kernels for sparse and quantized operators.
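A rough sketch of two of these stages using stock PyTorch tooling, magnitude pruning and dynamic INT8 quantization; the paper's actual pipeline additionally uses hardware-aware pruning, knowledge distillation, and a custom sparse runtime engine, none of which are reproduced here:

```python
import torch
from torch.nn.utils import prune
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased")

# Stage 1: unstructured magnitude pruning of the linear layers
# (80% sparsity here is an illustrative choice, not the paper's).
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.8)
        prune.remove(module, "weight")  # bake sparsity into the weights

# Stage 2 (knowledge distillation) would fine-tune this sparse student
# against a larger teacher; omitted here for brevity.

# Stage 3: post-training dynamic INT8 quantization for CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)
```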
1 code implementation • 18 Oct 2022 • Phillip Howard, Arden Ma, Vasudev Lal, Ana Paula Simoes, Daniel Korat, Oren Pereg, Moshe Wasserblat, Gadi Singer
The extraction of aspect terms is a critical step in fine-grained sentiment analysis of text.
1 code implementation • 22 Sep 2022 • Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg
This simple framework requires no prompts or verbalizers, and achieves high accuracy with orders of magnitude fewer parameters than existing techniques.
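A minimal sketch in the spirit of this framework, pairing a pre-trained sentence encoder with a lightweight classification head and no prompts or verbalizers; the full method also fine-tunes the encoder contrastively on labeled pairs, and the texts and labels below are toy few-shot data:

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Toy few-shot training set for illustration.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]

# Embed with a pre-trained sentence transformer, then fit a small head.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(texts)
head = LogisticRegression().fit(X, labels)

print(head.predict(encoder.encode(["an instant classic"])))
```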
no code implementations • 13 Apr 2022 • Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Roy Schwartz
To reduce this computational load at inference time, we present TangoBERT, a cascaded model architecture in which instances are first processed by an efficient but less accurate first-tier model, and only a subset of those instances is additionally processed by a less efficient but more accurate second-tier model.
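A minimal sketch of such a confidence-based cascade, assuming both tiers are callables that return class logits for a batch tensor; the threshold is an illustrative knob, not a value from the paper:

```python
import torch

@torch.no_grad()
def cascade_predict(fast_model, accurate_model, batch, threshold=0.9):
    """Two-tier cascade: trust the fast model when it is confident,
    and route only low-confidence instances to the accurate model."""
    probs = torch.softmax(fast_model(batch), dim=-1)
    conf, preds = probs.max(dim=-1)
    uncertain = conf < threshold  # instances the fast tier is unsure about
    if uncertain.any():
        big_logits = accurate_model(batch[uncertain])
        preds[uncertain] = big_logits.argmax(dim=-1)
    return preds
```

The speedup depends on how many instances clear the threshold: the second tier runs only on the uncertain fraction of the batch.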
no code implementations • 18 Nov 2021 • Shira Guskin, Moshe Wasserblat, Ke Ding, Gyuwan Kim
Additionally, a separate model must be trained for each inference scenario with its distinct computational budget.
2 code implementations • 10 Nov 2021 • Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
We show how the compressed sparse pre-trained models we trained transfer their knowledge to five different downstream natural language tasks with minimal accuracy loss.
Ranked #2 on Natural Language Inference on MultiNLI Dev
no code implementations • EACL 2021 • Vasudev Lal, Arden Ma, Estelle Aflalo, Phillip Howard, Ana Simoes, Daniel Korat, Oren Pereg, Gadi Singer, Moshe Wasserblat
With the increasingly widespread use of Transformer-based models for NLU/NLP tasks, there is growing interest in understanding the inner workings of these models, why they are so effective at a wide range of tasks, and how they can be further tuned and improved.
Aspect-Based Sentiment Analysis (ABSA)
no code implementations • COLING 2020 • Oren Pereg, Daniel Korat, Moshe Wasserblat
A fundamental task of fine-grained sentiment analysis is aspect and opinion terms extraction.
no code implementations • 14 Oct 2019 • Peter Izsak, Shira Guskin, Moshe Wasserblat
In this work in progress, we combine the transfer-learning effectiveness of pre-trained masked language models with a semi-supervised approach to train a fast, compact model using both labeled and unlabeled examples.
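A minimal sketch of the pseudo-labeling step such a semi-supervised setup typically relies on; the `teacher` model, data loader, and confidence threshold below are illustrative assumptions, not the paper's exact recipe:

```python
import torch

@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, threshold=0.95):
    """Label unlabeled batches with a fine-tuned teacher and keep
    only its high-confidence predictions as extra training data.
    Batches are assumed to be input-id tensors the teacher accepts."""
    pseudo_examples = []
    for batch in unlabeled_loader:
        probs = torch.softmax(teacher(batch).logits, dim=-1)
        conf, labels = probs.max(dim=-1)
        keep = conf >= threshold  # discard uncertain predictions
        pseudo_examples.extend(zip(batch[keep], labels[keep]))
    return pseudo_examples
```

The compact student is then trained on the union of the original labeled set and these pseudo-labeled examples.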
5 code implementations • 14 Oct 2019 • Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
Recently, pre-trained Transformer-based language models such as BERT and GPT have shown great improvements in many Natural Language Processing (NLP) tasks.
Ranked #13 on Semantic Textual Similarity on STS Benchmark
1 code implementation • IJCNLP 2019 • Oren Pereg, Daniel Korat, Moshe Wasserblat, Jonathan Mamou, Ido Dagan
We present ABSApp, a portable system for weakly-supervised aspect-based sentiment extraction.
no code implementations • WS 2019 • Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan
In this paper, we present a novel algorithm that combines multi-context term embeddings using a neural classifier, and we test this approach on the use case of corpus-based term set expansion.
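For illustration, a single-embedding simplification of corpus-based term set expansion; the paper's algorithm combines multiple context-type embeddings with a learned classifier rather than the plain centroid similarity shown here. `embeddings` is assumed to map each term in `vocab` to a unit-normalized vector:

```python
import numpy as np

def expand_term_set(seed_terms, embeddings, vocab, top_k=10):
    """Rank candidate terms by cosine similarity to the centroid of
    the seed terms and return the top-k as the expanded set."""
    centroid = np.mean([embeddings[t] for t in seed_terms], axis=0)
    centroid /= np.linalg.norm(centroid)
    scored = [(t, float(embeddings[t] @ centroid))
              for t in vocab if t not in seed_terms]
    return sorted(scored, key=lambda x: -x[1])[:top_k]
```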
no code implementations • EMNLP 2018 • Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat
We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class.
no code implementations • COLING 2018 • Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat
We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class.
no code implementations • 26 Jul 2018 • Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat
We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class.