Search Results for author: Sudipta Kar

Found 23 papers, 4 papers with code

SemEval-2022 Task 11: Multilingual Complex Named Entity Recognition (MultiCoNER)

no code implementations SemEval (NAACL) 2022 Shervin Malmasi, Anjie Fang, Besnik Fetahu, Sudipta Kar, Oleg Rokhlenko

Divided into 13 tracks, the task focused on methods to identify complex named entities (like names of movies, products and groups) in 11 languages in both monolingual and multilingual scenarios.

Named Entity Recognition +1

Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models

no code implementations 18 Feb 2024 Shirley Anugrah Hayati, Taehee Jung, Tristan Bodding-Long, Sudipta Kar, Abhinav Sethy, Joo-Kyung Kim, Dongyeop Kang

Fine-tuning large language models (LLMs) with a collection of large and diverse instructions has improved the model's generalization to different tasks, even for unseen tasks.

Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models

no code implementations 30 Oct 2023 Chris Richardson, Yao Zhang, Kellen Gillespie, Sudipta Kar, Arshdeep Singh, Zeynab Raeesy, Omar Zia Khan, Abhinav Sethy

To overcome these limitations, we propose a novel summary-augmented approach by extending retrieval-augmented personalization with task-aware user summaries generated by LLMs.

Language Modelling Retrieval

MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition

no code implementations 20 Oct 2023 Besnik Fetahu, Zhiyu Chen, Sudipta Kar, Oleg Rokhlenko, Shervin Malmasi

We present MULTICONER V2, a dataset for fine-grained Named Entity Recognition covering 33 entity classes across 12 languages, in both monolingual and multilingual settings.

Named Entity Recognition +2

Preventing Catastrophic Forgetting in Continual Learning of New Natural Language Tasks

no code implementations 22 Feb 2023 Sudipta Kar, Giuseppe Castellucci, Simone Filice, Shervin Malmasi, Oleg Rokhlenko

In this paper, we approach the problem of incrementally expanding MTL models' capability to solve new tasks over time by distilling the knowledge of an already trained model on n tasks into a new one for solving n+1 tasks.

Continual Learning Multi-Task Learning

Learning to Retrieve Engaging Follow-Up Queries

1 code implementation 21 Feb 2023 Christopher Richardson, Sudipta Kar, Anjishnu Kumar, Anand Ramachandran, Omar Zia Khan, Zeynab Raeesy, Abhinav Sethy

The retrieval system is trained on a dataset which contains ~14K multi-turn information-seeking conversations with a valid follow-up question and a set of invalid candidates.

Retrieval valid

MultiCoNER: A Large-scale Multilingual dataset for Complex Named Entity Recognition

no code implementations COLING 2022 Shervin Malmasi, Anjie Fang, Besnik Fetahu, Sudipta Kar, Oleg Rokhlenko

We present MultiCoNER, a large multilingual dataset for Named Entity Recognition that covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, as well as multilingual and code-mixing subsets.

Machine Translation Named Entity Recognition +3

LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation

no code implementations LREC 2020 Gustavo Aguilar, Sudipta Kar, Thamar Solorio

To facilitate research in this direction, we propose a centralized benchmark for Linguistic Code-switching Evaluation (LinCE) that combines ten corpora covering four different code-switched language pairs (i.e., Spanish-English, Nepali-English, Hindi-English, and Modern Standard Arabic-Egyptian Arabic) and four tasks (i.e., language identification, named entity recognition, part-of-speech tagging, and sentiment analysis).

Language Identification Named Entity Recognition +4

BanFakeNews: A Dataset for Detecting Fake News in Bangla

1 code implementation LREC 2020 Md Zobaer Hossain, Md Ashraful Rahman, Md. Saiful Islam, Sudipta Kar

In this work, we propose an annotated dataset of ~50K news articles that can be used for building automated fake news detection systems for a low-resource language like Bangla.

Fake News Detection

Multi-view Story Characterization from Movie Plot Synopses and Reviews

no code implementations EMNLP 2020 Sudipta Kar, Gustavo Aguilar, Mirella Lapata, Thamar Solorio

This paper considers the problem of characterizing stories by inferring properties such as theme and style using written synopses and reviews of movies.

TAG

Folksonomication: Predicting Tags for Movies from Plot Synopses Using Emotion Flow Encoded Neural Network

no code implementations COLING 2018 Sudipta Kar, Suraj Maharjan, Thamar Solorio

Folksonomy of movies covers a wide range of heterogeneous information about movies, like the genre, plot structure, visual experiences, soundtracks, metadata, and emotional experiences from watching a movie.

Retrieval

UH-PRHLT at SemEval-2016 Task 3: Combining Lexical and Semantic-based Features for Community Question Answering

no code implementations SEMEVAL 2016 Marc Franco-Salvador, Sudipta Kar, Thamar Solorio, Paolo Rosso

In this work we describe the system built for the three English subtasks of the SemEval 2016 Task 3 by the Department of Computer Science of the University of Houston (UH) and the Pattern Recognition and Human Language Technology (PRHLT) research center - Universitat Politècnica de València: UH-PRHLT.

Community Question Answering Knowledge Graphs

RiTUAL-UH at SemEval-2017 Task 5: Sentiment Analysis on Financial Data Using Neural Networks

no code implementations SEMEVAL 2017 Sudipta Kar, Suraj Maharjan, Thamar Solorio

In this paper, we present our systems for the "SemEval-2017 Task-5 on Fine-Grained Sentiment Analysis on Financial Microblogs and News".

Sentiment Analysis
