Search Results for author: Vasile Păiş

Found 9 papers, 2 papers with code

HistNERo: Historical Named Entity Recognition for the Romanian Language

no code implementations • 30 Apr 2024 • Andrei-Marius Avram, Andreea Iuga, George-Vlad Manolache, Vlad-Cristian Matei, Răzvan-Gabriel Micliuş, Vlad-Andrei Muntean, Manuel-Petru Sorlescu, Dragoş-Andrei Şerban, Adrian-Dinu Urse, Vasile Păiş, Dumitru-Clementin Cercel

This work introduces HistNERo, the first Romanian corpus for Named Entity Recognition (NER) in historical newspapers.

Towards Improving the Performance of Pre-Trained Speech Models for Low-Resource Languages Through Lateral Inhibition

no code implementations • 30 Jun 2023 • Andrei-Marius Avram, Răzvan-Alexandru Smădu, Vasile Păiş, Dumitru-Clementin Cercel, Radu Ion, Dan Tufiş

With the rise of bidirectional encoder representations from Transformer models in natural language processing, the speech community has adopted some of their development methodologies.

Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation

no code implementations • 17 Jun 2023 • Andrei-Marius Avram, Verginica Barbu Mititelu, Vasile Păiş, Dumitru-Clementin Cercel, Ştefan Trăuşan-Matu

Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text.

Domain Adaptation

An Open-Domain QA System for e-Governance

no code implementations • CLIB 2022 • Radu Ion, Andrei-Marius Avram, Vasile Păiş, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Valentin Badea

The paper will present the QA system and its integration with the Romanian language technologies portal RELATE, the COVID-19 data set and different evaluations of the QA performance.

Open-Domain Question Answering

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers

1 code implementation • LREC 2022 • Andrei-Marius Avram, Darius Catrina, Dumitru-Clementin Cercel, Mihai Dascălu, Traian Rebedea, Vasile Păiş, Dan Tufiş

In this work, we introduce three light and fast versions of distilled BERT models for the Romanian language: Distil-BERT-base-ro, Distil-RoBERT-base, and DistilMulti-BERT-base-ro.

Dialect Identification • Knowledge Distillation • +9

Romanian Speech Recognition Experiments from the ROBIN Project

1 code implementation • 23 Nov 2021 • Andrei-Marius Avram, Vasile Păiş, Dan Tufiş

One of the fundamental functionalities for accepting a socially assistive robot is its communication capabilities with other agents in the environment.

Language Modelling • Speech Recognition • +1

Human-Machine Interaction Speech Corpus from the ROBIN project

no code implementations • 22 Nov 2021 • Vasile Păiş, Radu Ion, Andrei-Marius Avram, Elena Irimia, Verginica Barbu Mititelu, Maria Mitrofan

The paper contains a detailed description of the acquisition process, corpus statistics, and an evaluation of the corpus's influence on a low-latency ASR system and on a dialogue component.

More Romanian word embeddings from the RETEROM project

no code implementations • 21 Nov 2021 • Vasile Păiş, Dan Tufiş

To this end, the previously created sets of word embeddings (based on word occurrences) on the CoRoLa corpus (Păiş and Tufiş, 2018) are and will be further augmented with new representations learned from the same corpus by using specific features such as lemmas and parts of speech.

LEMMA • Word Embeddings

Capitalization and Punctuation Restoration: a Survey

no code implementations • 21 Nov 2021 • Vasile Păiş, Dan Tufiş

Ensuring proper punctuation and letter casing is a key pre-processing step towards applying complex natural language processing algorithms.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2
