no code implementations • RANLP 2021 • Pavel Burnyshev, Andrey Bout, Valentin Malykh, Irina Piontkovskaya
Natural language understanding is an important task in modern dialogue systems.
no code implementations • RANLP 2021 • Artur Ilichev, Nikita Sorokin, Irina Piontkovskaya, Valentin Malykh
Language models are nowadays at the center of progress in natural language processing.
no code implementations • NAACL 2022 • Nikita Sorokin, Dmitry Abulkhanov, Irina Piontkovskaya, Valentin Malykh
Cross-lingual question answering is a thriving field in the modern world, helping people to search for information on the web more efficiently.
no code implementations • INLG (ACL) 2021 • Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya
We explore two approaches to the generation of task-oriented utterances: in the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.
no code implementations • 20 Nov 2023 • Andrey Bout, Alexander Podolskiy, Sergey Nikolenko, Irina Piontkovskaya
Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data.
no code implementations • 14 Nov 2023 • Konstantin Yakovlev, Gregory Polyakov, Ilseyar Alimova, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya
A recent trend in multimodal retrieval is related to postprocessing test set results via the dual-softmax loss (DSL).
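As a rough illustration of the idea behind dual-softmax postprocessing (a minimal sketch, not the paper's exact formulation; the function names and the toy matrix are mine), each entry of a query-gallery similarity matrix is rescaled by a softmax taken over the opposite axis, so a gallery item that scores highly against many queries is down-weighted before ranking:

```python
import math

def softmax(scores, temp=1.0):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(temp * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dual_softmax_rescale(sim, temp=1.0):
    """Rescale a query x gallery similarity matrix: multiply each entry
    by the softmax of its column (taken over all queries), penalizing
    gallery items that match many queries indiscriminately."""
    n_rows, n_cols = len(sim), len(sim[0])
    col_sm = [softmax([sim[i][j] for i in range(n_rows)], temp)
              for j in range(n_cols)]
    return [[sim[i][j] * col_sm[j][i] for j in range(n_cols)]
            for i in range(n_rows)]

# Toy 2x2 example: two queries, two gallery items.
rescaled = dual_softmax_rescale([[0.9, 0.8], [0.1, 0.7]])
```

Final rankings are then computed on the rescaled matrix instead of the raw similarities.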
1 code implementation • 14 Nov 2023 • Konstantin Yakovlev, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya
Grammatical error correction (GEC) is an important NLP task that is currently usually solved with autoregressive sequence-to-sequence models.
no code implementations • 14 Nov 2023 • Laida Kushnareva, Tatiana Gaintseva, German Magai, Serguei Barannikov, Dmitry Abulkhanov, Kristian Kuznetsov, Eduard Tulchinskii, Irina Piontkovskaya, Sergey Nikolenko
Due to the rapid development of large language models, people increasingly encounter texts that start as written by a human but continue as machine-generated.
1 code implementation • NeurIPS 2023 • Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev
The rapidly increasing quality of AI-generated content makes it difficult to distinguish between human-written and AI-generated texts, which may lead to undesirable consequences for society.
2 code implementations • 4 Apr 2023 • Irina Proskurina, Irina Piontkovskaya, Ekaterina Artemova
Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs.
Ranked #1 on Linguistic Acceptability on RuCoLA
no code implementations • 20 Mar 2023 • Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao
In this work, we develop a system that trains a trillion-parameter language model on a cluster of Ascend 910 AI processors with the MindSpore framework, and present the resulting language model with 1.085T parameters, named PanGu-Σ.
no code implementations • 30 Nov 2022 • Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev
We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT.
1 code implementation • 5 Jul 2022 • Laida Kushnareva, Dmitri Piontkovski, Irina Piontkovskaya
We apply methods of topological analysis to the attention graphs, calculated on the attention heads of the BERT model (arXiv:1810.04805v2).
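A common building block in this line of work is the 0-th Betti number of a thresholded attention graph, i.e. the number of connected components among tokens joined by sufficiently strong attention. The sketch below (my illustration, not the paper's code; the symmetrisation by `max` is an assumption) computes it with a union-find over a symmetrised attention matrix:

```python
def betti0(attn, threshold):
    """0-th Betti number (number of connected components) of the graph
    whose vertices are tokens and whose edges connect token pairs with
    symmetrised attention weight above `threshold`."""
    n = len(attn)
    parent = list(range(n))

    def find(x):
        # Path-halving find for the union-find structure.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if max(attn[i][j], attn[j][i]) > threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
    return len({find(i) for i in range(n)})
```

Sweeping the threshold from high to low yields the kind of persistence-style features that can be fed to a downstream classifier.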
no code implementations • 22 Jun 2022 • Dmitry Lamanov, Pavel Burnyshev, Ekaterina Artemova, Valentin Malykh, Andrey Bout, Irina Piontkovskaya
We outperform the previous state-of-the-art F1 measure by up to 16% for unseen intents, using intent labels and user utterances and without accessing external sources (such as knowledge bases).
1 code implementation • 19 May 2022 • Daniil Cherniavskii, Eduard Tulchinskii, Vladislav Mikhailov, Irina Proskurina, Laida Kushnareva, Ekaterina Artemova, Serguei Barannikov, Irina Piontkovskaya, Dmitri Piontkovski, Evgeny Burnaev
The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP.
Ranked #1 on Linguistic Acceptability on ItaCoLA
2 code implementations • EMNLP 2021 • Laida Kushnareva, Daniil Cherniavskii, Vladislav Mikhailov, Ekaterina Artemova, Serguei Barannikov, Alexander Bernstein, Irina Piontkovskaya, Dmitri Piontkovski, Evgeny Burnaev
The impressive capability of recent generative models to create texts that are challenging to distinguish from human-written ones can be misused for generating fake news, product reviews, and even abusive content.
no code implementations • 16 Aug 2021 • Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya
In the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.
1 code implementation • 11 Jan 2021 • Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya
In turn, the Mahalanobis distance captures this disparity easily.
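The Mahalanobis-distance approach to out-of-distribution detection can be sketched as follows (a minimal illustration under my own assumptions, not the paper's exact pipeline): fit a Gaussian to in-distribution feature vectors and score a new input by its distance to that Gaussian, with larger scores indicating likely OOD inputs.

```python
import numpy as np

def mahalanobis_ood_score(x, train_feats):
    """OOD score of feature vector `x`: Mahalanobis distance to the
    Gaussian fitted on in-distribution features `train_feats`
    (shape: n_samples x n_features)."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse for robustness
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))
```

In practice one would compute the features with a pretrained encoder and threshold the score; a point at the training mean scores 0, and scores grow as inputs drift away from the in-distribution density.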
no code implementations • COLING 2020 • Valentin Malykh, Konstantin Chernis, Ekaterina Artemova, Irina Piontkovskaya
The existing dialogue summarization corpora are significantly extractive.
no code implementations • ICLR 2018 • Vadim Popov, Mikhail Kudinov, Irina Piontkovskaya, Petr Vytovtov, Alex Nevidomsky
In language modeling, users' language (e.g., in private messaging) could change in a year and be completely different from what we observe in publicly available data.
no code implementations • 20 Dec 2017 • Vadim Popov, Mikhail Kudinov, Irina Piontkovskaya, Petr Vytovtov, Alex Nevidomsky
One of the big challenges in machine learning applications is that training data can be different from the real-world data faced by the algorithm.