Search Results for author: Mohammadreza Banaei

Found 7 papers, 6 papers with code

LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

1 code implementation • 27 May 2024 • Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor

The recent trend in scaling language models has led to a growing demand for parameter-efficient tuning (PEFT) methods such as LoRA (Low-Rank Adaptation).

Benchmarking, GSM8K +1

Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention

1 code implementation • 23 Oct 2023 • Negar Foroutan, Mohammadreza Banaei, Karl Aberer, Antoine Bosselut

We evaluate the cross-lingual reasoning abilities of MultiLMs in two schemes: (1) where the language of the context and the question remain the same in the new languages that are tested (i.e., the reasoning is still monolingual, but the model must transfer the learned reasoning ability across languages), and (2) where the languages of the context and the question are different (which we term code-switched reasoning).

Logical Reasoning

AdaGrid: Adaptive Grid Search for Link Prediction Training Objective

1 code implementation • 30 Mar 2022 • Tim Poštuvan, Jiaxuan You, Mohammadreza Banaei, Rémi Lebret, Jure Leskovec

To mitigate these limitations, we propose Adaptive Grid Search (AdaGrid), which dynamically adjusts the edge message ratio during training.

BIG-bench Machine Learning, Graph Neural Network +1

Direction is what you need: Improving Word Embedding Compression in Large Language Models

1 code implementation • ACL (RepL4NLP) 2021 • Klaudia Bałazy, Mohammadreza Banaei, Rémi Lebret, Jacek Tabor, Karl Aberer

The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters.

Language Modelling

Spoken dialect identification in Twitter using a multi-filter architecture

no code implementations • 5 Jun 2020 • Mohammadreza Banaei, Rémi Lebret, Karl Aberer

This paper presents our approach for the SwissText & KONVENS 2020 shared task 2: a multi-stage neural model for Swiss German (GSW) identification on Twitter.

Dialect Identification, Task 2
