1 code implementation • 2 Apr 2024 • Suman Adhya, Debarshi Kumar Sanyal
Topic modeling is a widely used approach for analyzing and exploring large document collections.
no code implementations • 28 Nov 2023 • Soumya Banerjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Partha Pratim Das
Digital libraries often face the challenge of processing a large volume of diverse document types.
Document Image Classification • Optical Character Recognition (OCR)
1 code implementation • 28 Sep 2023 • Tohida Rehman, Ronit Mandal, Abhishek Agarwal, Debarshi Kumar Sanyal
We use the following metrics to measure factual consistency at the entity level: precision-source and F1-target.
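The snippet above names the two metrics but does not define them. The following is a minimal sketch under assumed definitions commonly used for entity-level factual consistency: precision-source is taken as the fraction of named entities in the generated summary that also occur in the source document, and F1-target as the harmonic mean of entity precision and recall against the reference (target) summary. The function names, the exact-string matching, and the assumption that entities have already been extracted by some NER tool are all illustrative and not taken from the paper.

```python
def precision_source(summary_entities, source_entities):
    """Fraction of summary entities that also appear in the source (assumed definition)."""
    summary = set(summary_entities)
    if not summary:
        return 0.0  # convention chosen here for summaries with no entities
    return len(summary & set(source_entities)) / len(summary)


def f1_target(summary_entities, target_entities):
    """Entity-level F1 between generated and reference summaries (assumed definition)."""
    summary, target = set(summary_entities), set(target_entities)
    overlap = len(summary & target)
    if overlap == 0:
        return 0.0
    precision = overlap / len(summary)
    recall = overlap / len(target)
    return 2 * precision * recall / (precision + recall)


# Hypothetical entity sets, purely for illustration.
source = {"BERT", "ACL", "2019"}
generated = {"BERT", "2020"}
reference = {"BERT", "ACL"}

print(precision_source(generated, source))  # 0.5
print(f1_target(generated, reference))      # 0.5
```

Under these assumptions, a low precision-source suggests the summary mentions entities that are absent from the source, which is one signal of hallucination.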
1 code implementation • 25 Apr 2023 • Avishek Lahiri, Debarshi Kumar Sanyal, Imon Mukherjee
For the ACL-ARC dataset, we report a 53.86% F1 score for the zero-shot setting, which improves to 63.61% and 66.99% for the 5-shot and 10-shot settings, respectively.
Ranked #2 on Citation Intent Classification on ACL-ARC
1 code implementation • PoliticalNLP (LREC) 2022 • Suman Adhya, Debarshi Kumar Sanyal
The TCPD-IPD dataset is a collection of questions and answers discussed in the Lower House of the Parliament of India during the Question Hour between 1999 and 2019.
1 code implementation • 28 Mar 2023 • Suman Adhya, Avishek Lahiri, Debarshi Kumar Sanyal
Dropout is a widely used regularization technique for mitigating overfitting in large feedforward neural networks trained on small datasets, which otherwise perform poorly on the held-out test subset.
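For readers unfamiliar with the mechanism, here is a minimal sketch of standard "inverted" dropout on a list of activations. It illustrates the general technique only and is not the method studied in the paper; the function name `dropout` and its parameters are hypothetical.

```python
import random


def dropout(activations, drop_prob, training=True, rng=None):
    """Zero each activation with probability drop_prob and rescale the survivors.

    Rescaling by 1 / (1 - drop_prob) keeps the expected value of each unit
    unchanged, so no adjustment is needed at evaluation time.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)  # dropout is disabled at evaluation time
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# With drop_prob = 0.5, each surviving unit is doubled and the rest are zeroed.
print(dropout([1.0, 1.0, 1.0, 1.0], drop_prob=0.5, rng=random.Random(0)))
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which is the usual explanation for the regularizing effect.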
1 code implementation • 27 Mar 2023 • Suman Adhya, Debarshi Kumar Sanyal
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries.
2 code implementations • 27 Mar 2023 • Suman Adhya, Avishek Lahiri, Debarshi Kumar Sanyal, Partha Pratim Das
Topic modeling has emerged as a dominant method for exploring large document collections.
no code implementations • 25 Feb 2023 • Tohida Rehman, Suchandan Das, Debarshi Kumar Sanyal, Samiran Chattopadhyay
Indeed, automatic text summarization has emerged as an important application of machine learning in text processing.
no code implementations • 25 Feb 2023 • Tohida Rehman, Suchandan Das, Debarshi Kumar Sanyal, Samiran Chattopadhyay
People nowadays use search engines like Google, Yahoo, and Bing to find information on the Internet.
no code implementations • sdp (COLING) 2022 • Tohida Rehman, Debarshi Kumar Sanyal, Prasenjit Majumder, Samiran Chattopadhyay
We investigate whether the use of named entity recognition on the input improves the quality of the generated highlights.
1 code implementation • 14 Feb 2023 • Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Partha Pratim Das
On the new MixSub dataset, where only the abstract is the input, our proposed model (when trained on the whole training corpus without distinguishing between the subject categories) achieves ROUGE-1, ROUGE-2 and ROUGE-L F1-scores of 31.78, 9.76 and 29.3, respectively, a METEOR score of 24.00, and a BERTScore F1 of 85.25.
no code implementations • 25 Apr 2022 • Prantika Chakraborty, Sudakshina Dutta, Debarshi Kumar Sanyal
Maintaining research-related information in an organized manner can be challenging for a researcher.
1 code implementation • Extraction and Evaluation of Knowledge Entities from Scientific Documents 2021 • T Y S S Santosh, Prantika Chakraborty, Sudakshina Dutta, Debarshi Kumar Sanyal, Partha Pratim Das
Scientific articles contain various types of domain-specific entities and relations between them.
Ranked #3 on Joint Entity and Relation Extraction on SciERC
Joint Entity and Relation Extraction • Joint Entity and Relation Extraction on Scientific Data
no code implementations • COLING 2020 • T Y S S Santosh, Debarshi Kumar Sanyal, Plaban Kumar Bhowmick, Partha Pratim Das
Keyphrases in a research paper succinctly capture the primary content of the paper and also assist in indexing the paper at a concept level.
1 code implementation • 11 May 2020 • Soumya Banerjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Parthapratim Das
In the biomedical literature, it is customary to structure an abstract into discourse categories like BACKGROUND, OBJECTIVE, METHOD, RESULT, and CONCLUSION, but this segmentation is uncommon in other fields like computer science.