no code implementations • ICON 2020 • Souvick Das, Rajat Pandit, Sudip Kumar Naskar
In this paper, we study and review existing works on stemming in Bengali and other Indian languages.
no code implementations • EACL (DravidianLangTech) 2021 • Avishek Garain, Atanu Mandal, Sudip Kumar Naskar
Offensive language identification has been an active area of research in natural language processing.
no code implementations • 25 Jan 2024 • Amit Barman, Devangan Roy, Debapriya Paul, Indranil Dutta, Shouvik Kumar Guha, Samir Karmakar, Sudip Kumar Naskar
There is an evident lack of implementation of Machine Learning (ML) in the legal domain in India, and any research that does take place in this domain is usually based on data from the higher courts of law and works with English data.
2 code implementations • 19 Jan 2024 • Atanu Mandal, Gargi Roy, Amit Barman, Indranil Dutta, Sudip Kumar Naskar
With the recent surge and exponential growth of social media usage, scrutinizing social media content for the presence of any hateful content is of utmost importance.
no code implementations • 5 Nov 2023 • Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar
Research in scientific disciplines evolves, often rapidly, over time with the emergence of novel methodologies and their associated terminologies.
1 code implementation • 20 Mar 2023 • Sohom Ghosh, Ankush Chopra, Sudip Kumar Naskar
Every industry has terms that are specific to the domain it operates in.
1 code implementation • 1 Nov 2022 • Anubhav Sarkar, Swagata Chakraborty, Sohom Ghosh, Sudip Kumar Naskar
This paper investigates the impact of social media posts on close price prediction of stocks using Twitter and Reddit posts.
1 code implementation • 26 Jan 2022 • Sohom Ghosh, Sudip Kumar Naskar
It extracts context embeddings of the numerals using BERT, a transformer-based pre-trained language model.
no code implementations • 31 Dec 2021 • Afia Fairoose Abedin, Amirul Islam Al Mamun, Rownak Jahan Nowrin, Amitabha Chakrabarty, Moin Mostakim, Sudip Kumar Naskar
By extracting client reviews from chatbot conversations, organizations can reduce the major gap in understanding between users and the chatbot and improve the quality of their products and services. Thus, in our research we incorporated all the key elements that are necessary for a chatbot to analyse and understand an input text precisely and accurately.
no code implementations • 5 Oct 2021 • Atanu Mandal, Santanu Pal, Indranil Dutta, Mahidas Bhattacharya, Sudip Kumar Naskar
Language Identification (LID) is a crucial preliminary process in the field of Automatic Speech Recognition (ASR) that involves the identification of a spoken language from audio samples.
Ranked #1 on Spoken language identification on IndicTTS
no code implementations • 31 Aug 2020 • Somnath Banerjee, Sudip Kumar Naskar, Paolo Rosso, Sivaji Bandyopadhyay
Overall, the stacking approach produces the best results for fine-grained classification and achieves 87.79% accuracy.
no code implementations • LREC 2020 • Anisha Datta, Shukrity Si, Urbi Chakraborty, Sudip Kumar Naskar
In the last few years, hate speech and aggressive comments have spread across almost all social media platforms, such as Facebook and Twitter.
no code implementations • COLING 2020 • Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith
In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input.
no code implementations • WS 2019 • Mihaela Vela, Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Josef van Genabith
User feedback revealed that the users preferred CATaLog Online over existing CAT tools in some respects, especially for selecting the output of the MT system and for taking advantage of the color scheme for TM suggestions.
no code implementations • WS 2019 • Riktim Mondal, Shankha Raj Nayek, Aditya Chowdhury, Santanu Pal, Sudip Kumar Naskar, Josef van Genabith
In this paper we describe our joint submission (JU-Saarland) from Jadavpur University and Saarland University in the WMT 2019 news translation shared task for the English–Gujarati language pair within the translation task sub-track.
1 code implementation • SEMEVAL 2019 • Preeti Mukherjee, Mainak Pal, Somnath Banerjee, Sudip Kumar Naskar
This paper describes our system submissions as part of our participation (team name: JU_ETCE_17_21) in the SemEval 2019 shared task 6: "OffensEval: Identifying and Categorizing Offensive Language in Social Media".
no code implementations • WS 2018 • Prasenjit Basu, Santanu Pal, Sudip Kumar Naskar
The paper presents our participation in the WMT 2018 shared task on word level quality estimation (QE) of machine translated (MT) text, i.e., to predict whether a word in MT output for a given source context is correctly translated and hence should be retained in the post-edited translation (PE), or not.
no code implementations • WS 2018 • Joybrata Panja, Sudip Kumar Naskar
The paper presents our participation in the WMT 2018 Metrics Shared Task.
no code implementations • EACL 2017 • Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Qun Liu, Josef van Genabith
APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system.
no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Marcos Zampieri, Tapas Nayak, Josef van Genabith
We present a free web-based CAT tool called CATaLog Online which provides a novel and user-friendly online CAT environment for post-editors/translators.
no code implementations • COLING 2016 • Santanu Pal, Sudip Kumar Naskar, Josef van Genabith
In the paper we show that parallel system combination in the APE stage of a sequential MT-APE combination yields substantial translation improvements, measured both in terms of automatic evaluation metrics and in terms of productivity improvements in a post-editing experiment.
no code implementations • LREC 2016 • Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Tapas Nayak, Mihaela Vela, Josef van Genabith
The tool features a number of editing and log functions similar to the desktop version of CATaLog, enhanced with several new features that we describe in detail in this paper.
no code implementations • LREC 2014 • Santanu Pal, Sudip Kumar Naskar, Sivaji Bandyopadhyay
Reordering poses a big challenge in statistical machine translation between distant language pairs.