no code implementations • EACL (VarDial) 2021 • George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Traian Rebedea
Dialect identification is a task with applicability in a vast array of domains, ranging from automatic speech recognition to opinion mining.
no code implementations • VarDial (COLING) 2020 • George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Traian Rebedea
Dialect identification represents a key aspect for improving a series of tasks, for example, opinion mining, considering that the location of the speaker can greatly influence the attitude towards a subject.
no code implementations • 30 Dec 2022 • Răzvan-Alexandru Smădu, George-Eduard Zaharia, Andrei-Marius Avram, Dumitru-Clementin Cercel, Mihai Dascalu, Florin Pop
Keyphrase identification and classification is a Natural Language Processing and Information Retrieval task that involves extracting relevant groups of words from a given text related to the main topic.
no code implementations • ACL 2022 • George-Eduard Zaharia, Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihai Dascalu
Our model obtains a boost of up to 2.42% in terms of Pearson Correlation Coefficients compared to vanilla training techniques on the CompLex dataset from the Lexical Complexity Prediction 2021 shared task.
no code implementations • SEMEVAL 2021 • George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Our models are applicable to both subtasks and achieve good performance results, with a MAE below 0.07 and a Pearson correlation of 0.73 for single word identification, as well as a MAE below 0.08 and a Pearson correlation of 0.79 for multiple word targets.
no code implementations • SEMEVAL 2021 • Andrei-Marius Avram, George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Extracting semantic information on measurements and counts is an important topic in terms of analyzing scientific discourses.
no code implementations • 2 Oct 2020 • George-Eduard Zaharia, Dumitru-Clementin Cercel, Mihai Dascalu
Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment by relying on the CWI shared task 2018 dataset available for four different languages (i.e., English, German, Spanish, and French).
no code implementations • SEMEVAL 2020 • George-Eduard Zaharia, George-Alexandru Vlad, Dumitru-Clementin Cercel, Traian Rebedea, Costin-Gabriel Chiru
In this paper, we describe the systems developed by our team for SemEval-2020 Task 9 that aims to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
no code implementations • SEMEVAL 2020 • George-Alexandru Vlad, George-Eduard Zaharia, Dumitru-Clementin Cercel, Costin-Gabriel Chiru, Stefan Trausan-Matu
Users from the online environment can create different ways of expressing their thoughts, opinions, or conception of amusement.