Search Results for author: Irina Piontkovskaya

Found 21 papers, 7 papers with code

Ask Me Anything in Your Native Language

no code implementations NAACL 2022 Nikita Sorokin, Dmitry Abulkhanov, Irina Piontkovskaya, Valentin Malykh

Cross-lingual question answering is a thriving field in the modern world, helping people to search for information on the web more efficiently.

Cross-Lingual Question Answering Retrieval

Single Example Can Improve Zero-Shot Data Generation

no code implementations INLG (ACL) 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

We explore two approaches to the generation of task-oriented utterances: in the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

Intent Classification +1

AI-generated text boundary detection with RoFT

no code implementations 14 Nov 2023 Laida Kushnareva, Tatiana Gaintseva, German Magai, Serguei Barannikov, Dmitry Abulkhanov, Kristian Kuznetsov, Eduard Tulchinskii, Irina Piontkovskaya, Sergey Nikolenko

Due to the rapid development of large language models, people increasingly often encounter texts that may start as written by a human but continue as machine-generated.

Boundary Detection Text Detection +2

Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts

1 code implementation NeurIPS 2023 Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society.

Can BERT eat RuCoLA? Topological Data Analysis to Explain

2 code implementations 4 Apr 2023 Irina Proskurina, Irina Piontkovskaya, Ekaterina Artemova

Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs.

CoLA Linguistic Acceptability +2

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

no code implementations 20 Mar 2023 Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao

In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors and MindSpore framework, and present the language model with 1.085T parameters named PanGu-Σ.

Code Generation Language Modelling +4

Topological Data Analysis for Speech Processing

no code implementations 30 Nov 2022 Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT.

Topological Data Analysis

Betti numbers of attention graphs is all you really need

1 code implementation 5 Jul 2022 Laida Kushnareva, Dmitri Piontkovski, Irina Piontkovskaya

We apply methods of topological analysis to the attention graphs, calculated on the attention heads of the BERT model (arXiv:1810.04805v2).

Text Classification

Template-based Approach to Zero-shot Intent Recognition

no code implementations 22 Jun 2022 Dmitry Lamanov, Pavel Burnyshev, Ekaterina Artemova, Valentin Malykh, Andrey Bout, Irina Piontkovskaya

We outperform previous state-of-the-art F1-measure by up to 16% for unseen intents, using intent labels and user utterances and without accessing external sources (such as knowledge bases).

Intent Recognition Natural Language Inference +6

Artificial Text Detection via Examining the Topology of Attention Maps

2 code implementations EMNLP 2021 Laida Kushnareva, Daniil Cherniavskii, Vladislav Mikhailov, Ekaterina Artemova, Serguei Barannikov, Alexander Bernstein, Irina Piontkovskaya, Dmitri Piontkovski, Evgeny Burnaev

The impressive capabilities of recent generative models to create texts that are challenging to distinguish from the human-written ones can be misused for generating fake news, product reviews, and even abusive content.

Text Detection Topological Data Analysis

A Single Example Can Improve Zero-Shot Data Generation

no code implementations 16 Aug 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

In the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

Intent Classification +1

Distributed Fine-tuning of Language Models on Private Data

no code implementations ICLR 2018 Vadim Popov, Mikhail Kudinov, Irina Piontkovskaya, Petr Vytovtov, Alex Nevidomsky

In language modeling, users’ language (e.g., in private messaging) could change in a year and be completely different from what we observe in publicly available data.

General Knowledge Language Modelling

Differentially Private Distributed Learning for Language Modeling Tasks

no code implementations 20 Dec 2017 Vadim Popov, Mikhail Kudinov, Irina Piontkovskaya, Petr Vytovtov, Alex Nevidomsky

One of the big challenges in machine learning applications is that training data can be different from the real-world data faced by the algorithm.

General Knowledge Language Modelling
