no code implementations • 25 Oct 2023 • Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Srikar Appalaraju, Bonan Min, Yogarshi Vyas
Document image classification differs from plain-text document classification: it classifies a document by understanding both the content and the structure of documents such as forms and emails.
1 code implementation • 2 Jun 2023 • Srikar Appalaraju, Peng Tang, Qi Dong, Nishant Sankaran, Yichu Zhou, R. Manmatha
We propose DocFormerv2, a multi-modal transformer for Visual Document Understanding (VDU).
Ranked #9 on Visual Question Answering (VQA) on DocVQA test (using extra training data)
no code implementations • 15 Nov 2018 • Neeti Narayan, Nishant Sankaran, Srirangaraj Setlur, Venu Govindaraju
We present a feature aggregation architecture called Composite Appearance Network (CAN) to address the above problem.