Search Results for author: Mohammadreza Banaei

Found 7 papers, 6 papers with code

LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

1 code implementation • 27 May 2024 • Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor

The recent trend in scaling language models has led to a growing demand for parameter-efficient tuning (PEFT) methods such as LoRA (Low-Rank Adaptation).

Benchmarking, GSM8K


Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention

1 code implementation • 23 Oct 2023 • Negar Foroutan, Mohammadreza Banaei, Karl Aberer, Antoine Bosselut

We evaluate the cross-lingual reasoning abilities of MultiLMs in two schemes: (1) where the language of the context and the question remain the same in the new languages that are tested (i.e., the reasoning is still monolingual, but the model must transfer the learned reasoning ability across languages), and (2) where the language of the context and the question is different (which we term code-switched reasoning).

Logical Reasoning


Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models

1 code implementation • 8 Feb 2023 • Mohammadreza Banaei, Klaudia Bałazy, Artur Kasymov, Rémi Lebret, Jacek Tabor, Karl Aberer

Recent transformer language models achieve outstanding results in many natural language processing (NLP) tasks.


Discovering Language-neutral Sub-networks in Multilingual Language Models

1 code implementation • 25 May 2022 • Negar Foroutan, Mohammadreza Banaei, Remi Lebret, Antoine Bosselut, Karl Aberer

Multilingual pre-trained language models transfer remarkably well on cross-lingual downstream tasks.

Cross-Lingual Transfer


AdaGrid: Adaptive Grid Search for Link Prediction Training Objective

1 code implementation • 30 Mar 2022 • Tim Poštuvan, Jiaxuan You, Mohammadreza Banaei, Rémi Lebret, Jure Leskovec

To mitigate these limitations, we propose Adaptive Grid Search (AdaGrid), which dynamically adjusts the edge message ratio during training.

BIG-bench Machine Learning, Graph Neural Network


Direction is what you need: Improving Word Embedding Compression in Large Language Models

1 code implementation • ACL (RepL4NLP) 2021 • Klaudia Bałazy, Mohammadreza Banaei, Rémi Lebret, Jacek Tabor, Karl Aberer

The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters.

Language Modelling


Spoken dialect identification in Twitter using a multi-filter architecture

no code implementations • 5 Jun 2020 • Mohammadreza Banaei, Rémi Lebret, Karl Aberer

This paper presents our approach for SwissText & KONVENS 2020 shared task 2, which is a multi-stage neural model for Swiss German (GSW) identification on Twitter.

Dialect Identification, Task 2

