Search Results for author: Soyeon Caren Han

Found 34 papers, 17 papers with code

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

no code implementations · 19 Apr 2024 · Yihao Ding, Kaixuan Ren, Jiabin Huang, Siwen Luo, Soyeon Caren Han

Document Question Answering (QA) presents a challenge in understanding visually-rich documents (VRD), particularly those dominated by lengthy textual content like research journal articles.

Information Retrieval · Machine Reading Comprehension +3

A Survey of Large Language Models in Finance (FinLLMs)

1 code implementation · 4 Feb 2024 · Jean Lee, Nicholas Stevens, Soyeon Caren Han, Minseok Song

This survey provides a comprehensive overview of FinLLMs, including their history, techniques, performance, and opportunities and challenges.

Named Entity Recognition (NER) · Question Answering +4

SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling

no code implementations · 1 Feb 2024 · Eileen Wang, Soyeon Caren Han, Josiah Poon

This weighted story graph produces the storyline as a sequence of events using the Floyd-Warshall algorithm.

Image Captioning · Visual Grounding +1
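The storyline-selection step described above reduces to an all-pairs shortest-path computation. Below is a minimal sketch of the Floyd-Warshall algorithm with path reconstruction on a hypothetical toy event graph; SCO-VIST's actual graph construction and commonsense-based edge weighting are not shown.

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest paths with path reconstruction.
    edges: dict mapping (u, v) -> weight."""
    dist = [[math.inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for (u, v), w in edges.items():
        dist[u][v] = w
        nxt[u][v] = v
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def path(nxt, u, v):
    """Reconstruct the event sequence (storyline) from u to v."""
    if nxt[u][v] is None:
        return []
    p = [u]
    while u != v:
        u = nxt[u][v]
        p.append(u)
    return p

# Toy event graph: lower weight = more plausible transition between events.
edges = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 3.0, (2, 3): 1.0, (1, 3): 5.0}
dist, nxt = floyd_warshall(4, edges)
print(path(nxt, 0, 3))  # -> [0, 1, 2, 3]
```

The lowest-cost route through the event graph, here `0 -> 1 -> 2 -> 3`, plays the role of the selected storyline.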

Re-Temp: Relation-Aware Temporal Representation Learning for Temporal Knowledge Graph Completion

1 code implementation · 24 Oct 2023 · Kunze Wang, Soyeon Caren Han, Josiah Poon

Temporal Knowledge Graph Completion (TKGC) under the extrapolation setting aims to predict the missing entity from a fact in the future, posing a challenge that aligns more closely with real-world prediction problems.

Knowledge Graph Completion · Relation +2

MC-DRE: Multi-Aspect Cross Integration for Drug Event/Entity Extraction

1 code implementation · 12 Aug 2023 · Jie Yang, Soyeon Caren Han, Siqu Long, Josiah Poon, Goran Nenadic

Extracting meaningful drug-related information chunks, such as adverse drug events (ADE), is crucial for preventing morbidity and saving many lives.

Event Detection · Event Extraction +4

Workshop on Document Intelligence Understanding

no code implementations · 31 Jul 2023 · Soyeon Caren Han, Yihao Ding, Siwen Luo, Josiah Poon, HeeGuen Yoon, Zhe Huang, Paul Duuring, Eun Jung Holden

Document understanding and information extraction encompass a range of tasks for understanding a document and automatically extracting valuable information.

document understanding · Visual Question Answering (VQA)

Tri-level Joint Natural Language Understanding for Multi-turn Conversational Datasets

1 code implementation · 28 May 2023 · Henry Weld, Sijia Hu, Siqu Long, Josiah Poon, Soyeon Caren Han

We present a novel tri-level joint natural language understanding approach that adds a domain level and explicitly exchanges semantic information between all levels.

Intent Detection · Natural Language Understanding +3

Graph Neural Networks for Text Classification: A Survey

no code implementations · 23 Apr 2023 · Kunze Wang, Yihao Ding, Soyeon Caren Han

Text Classification is the most essential and fundamental problem in Natural Language Processing.

graph construction · text-classification +1

PDFVQA: A New Dataset for Real-World VQA on PDF Documents

no code implementations · 13 Apr 2023 · Yihao Ding, Siwen Luo, Hyunsuk Chung, Soyeon Caren Han

Document-based Visual Question Answering examines a model's understanding of document images through natural language questions.

document understanding · Key Information Extraction +2

Form-NLU: Dataset for the Form Natural Language Understanding

1 code implementation · 4 Apr 2023 · Yihao Ding, Siqu Long, Jiabin Huang, Kaixuan Ren, Xingxiang Luo, Hyunsuk Chung, Soyeon Caren Han

Compared to general document analysis tasks, form document structure understanding and retrieval are challenging.

4k · Key Information Extraction +4

Spoken Language Understanding for Conversational AI: Recent Advances and Future Direction

no code implementations · 21 Dec 2022 · Soyeon Caren Han, Siqu Long, Henry Weld, Josiah Poon

This tutorial will discuss how the joint task is set up and introduce Spoken Language Understanding/Natural Language Understanding (SLU/NLU) with Deep Learning techniques.

intent-classification · Intent Classification +7

PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

no code implementations · 29 Nov 2022 · Zhihao Zhang, Siwen Luo, Junyi Chen, Sijia Lai, Siqu Long, Hyunsuk Chung, Soyeon Caren Han

We propose PiggyBack, a Visual Question Answering platform that allows users to easily apply state-of-the-art visual-language pretrained models.

Question Answering · Visual Question Answering

SG-Shuffle: Multi-aspect Shuffle Transformer for Scene Graph Generation

no code implementations · 9 Nov 2022 · Anh Duc Bui, Soyeon Caren Han, Josiah Poon

Scene Graph Generation (SGG) serves as a comprehensive representation of images for human understanding as well as visual understanding tasks.

Graph Generation · Scene Graph Generation

SUPER-Rec: SUrrounding Position-Enhanced Representation for Recommendation

no code implementations · 9 Sep 2022 · Taejun Lim, Siqu Long, Josiah Poon, Soyeon Caren Han

Collaborative filtering problems are commonly solved based on matrix completion techniques which recover the missing values of user-item interaction matrices.

Collaborative Filtering · Matrix Completion +4
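The matrix-completion setting described above can be illustrated with plain matrix factorization fitted by SGD on observed ratings only. This is a generic sketch, not SUPER-Rec's surrounding-position-enhanced model; the toy rating matrix and all hyperparameters are arbitrary.

```python
import numpy as np

def factorize(R, mask, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Recover missing entries of R by fitting R ~ U @ V.T on observed cells."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - U[i] @ V[j]          # residual on one observed cell
            ui = U[i].copy()                     # keep pre-update copy for V's step
            U[i] += lr * (err * V[j] - reg * U[i])
            V[j] += lr * (err * ui - reg * V[j])
    return U @ V.T                               # dense reconstruction

# Toy user-item matrix; 0 marks an unobserved rating.
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
mask = R > 0
R_hat = factorize(R, mask)
print(np.round(R_hat, 1))   # unobserved cells now carry predicted ratings
```

Observed entries are reconstructed closely, and the previously missing cells serve as the model's rating predictions.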

K-MHaS: A Multi-label Hate Speech Detection Dataset in Korean Online News Comment

1 code implementation · COLING 2022 · Jean Lee, Taejun Lim, Heejun Lee, Bogeun Jo, Yangsok Kim, HeeGeun Yoon, Soyeon Caren Han

Online hate speech detection has become an important issue due to the growth of online content, but resources in languages other than English are extremely limited.

Hate Speech Detection · Multi-Label Classification

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

1 code implementation · COLING 2022 · Siwen Luo, Yihao Ding, Siqu Long, Josiah Poon, Soyeon Caren Han

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into a structured, machine-readable format for downstream applications.

Component Classification · Document Layout Analysis

Understanding Attention for Vision-and-Language Tasks

1 code implementation · COLING 2022 · Feiqi Cao, Soyeon Caren Han, Siqu Long, Changwei Xu, Josiah Poon

The attention mechanism has been used as an important component across Vision-and-Language (VL) tasks to bridge the semantic gap between visual and textual features.

Image Retrieval · Question Answering +4
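For reference, the attention mechanism in its standard scaled dot-product form can be sketched as follows. This is a generic implementation, not the specific VL attention-alignment variants the paper analyses; the query/key shapes and the visual-region framing are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend queries Q over key-value pairs (K, V): softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights                   # weighted sum of values

rng = np.random.default_rng(1)
Q = rng.normal(size=(2, 8))   # e.g. 2 textual query tokens
K = rng.normal(size=(5, 8))   # e.g. 5 visual region keys
V = rng.normal(size=(5, 8))   # values paired with the keys
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)              # (2, 8): one attended vector per query
```

Each row of `w` is a probability distribution over the five regions, which is how attention "bridges" a textual token to the visual features it draws on.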

InducT-GCN: Inductive Graph Convolutional Networks for Text Classification

1 code implementation · 1 Jun 2022 · Kunze Wang, Soyeon Caren Han, Josiah Poon

Under extreme settings with no extra resources and a limited amount of training data, can we still learn an inductive graph-based text classification model?

text-classification · Text Classification

V-Doc: Visual questions answers with Documents

no code implementations · 27 May 2022 · Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

We propose V-Doc, a question-answering tool using document images and PDFs, mainly for researchers and general non-deep-learning experts looking to generate, process, and understand document visual question answering tasks.

Question Answering · Question Generation +2

Vision-and-Language Pretrained Models: A Survey

no code implementations · 15 Apr 2022 · Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang

Pretrained models have produced great success in both Computer Vision (CV) and Natural Language Processing (NLP).

Understanding Graph Convolutional Networks for Text Classification

1 code implementation · 30 Mar 2022 · Soyeon Caren Han, Zihan Yuan, Kunze Wang, Siqu Long, Josiah Poon

Graph Convolutional Networks (GCN) have been effective at tasks that have rich relational structure and can preserve global structure information of a dataset in graph embeddings.

graph construction · text-classification +1
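The GCN layer the paper studies follows the usual propagation rule H' = ReLU(Â H W), where Â is the symmetrically normalized adjacency with self-loops. Below is a minimal numpy sketch on a hypothetical three-node text graph (e.g. one document node connected to two word nodes); the features and weights are random placeholders.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization
    return np.maximum(0.0, A_norm @ H @ W)     # ReLU activation

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # document node 0 linked to word nodes 1, 2
H = rng.normal(size=(3, 4))              # initial node features
W = rng.normal(size=(4, 2))              # layer weights
out = gcn_layer(A, H, W)
print(out.shape)                         # (3, 2)
```

Stacking two such layers lets each document node aggregate information from words two hops away, which is the source of the "global structure" preserved in the embeddings.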

Bi-directional Joint Neural Networks for Intent Classification and Slot Filling

no code implementations · 26 Feb 2022 · Soyeon Caren Han, Siqu Long, Huichun Li, Henry Weld, Josiah Poon

In this paper, we propose a bi-directional joint model for intent classification and slot filling. The model comprises a multi-stage hierarchical process via BERT and bi-directional joint natural language understanding mechanisms, intent2slot and slot2intent, to obtain mutual performance enhancement between intent classification and slot filling.

Classification · intent-classification +6

V-Doc: Visual Questions Answers With Documents

no code implementations · CVPR 2022 · Yihao Ding, Zhe Huang, Runlin Wang, Yanhang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han

We propose V-Doc, a question-answering tool using document images and PDFs, mainly for researchers and general non-deep-learning experts looking to generate, process, and understand document visual question answering tasks.

Question Answering · Question Generation +2

GLocal-K: Global and Local Kernels for Recommender Systems

3 code implementations · 27 Aug 2021 · Soyeon Caren Han, Taejun Lim, Siqu Long, Bernd Burgstaller, Josiah Poon

Then, the pre-trained autoencoder is fine-tuned with the rating matrix produced by a convolution-based global kernel, which captures the characteristics of each item.

Collaborative Filtering · Matrix Completion +1
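The fine-tuning idea above, an autoencoder reconstructing observed ratings, can be sketched in a simple AutoRec-style form. This toy version omits GLocal-K's local and global kernels entirely; the network size, rating data, and learning rate are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0]])       # user x item ratings; 0 marks missing
mask = (R > 0).astype(float)

n_items, hidden = R.shape[1], 2
W1 = rng.normal(scale=0.1, size=(n_items, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, n_items)); b2 = np.zeros(n_items)

lr = 0.05
for _ in range(5000):
    Z = np.tanh(R @ W1 + b1)          # encode each user's rating row
    R_hat = Z @ W2 + b2               # decode: reconstructed ratings
    err = (R_hat - R) * mask          # loss counts observed entries only
    gW2 = Z.T @ err                   # backprop through the decoder
    gb2 = err.sum(axis=0)
    dZ = (err @ W2.T) * (1.0 - Z ** 2)  # backprop through tanh
    gW1 = R.T @ dZ
    gb1 = dZ.sum(axis=0)
    W1 -= lr * gW1 / len(R); b1 -= lr * gb1 / len(R)
    W2 -= lr * gW2 / len(R); b2 -= lr * gb2 / len(R)

print(np.round(R_hat, 1))             # missing cells now carry predictions
```

Because the reconstruction loss is masked to observed cells, the decoder's outputs at unobserved positions act as rating predictions, the same role the fine-tuned autoencoder plays in the paper's pipeline.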

FedNLP: An interpretable NLP System to Decode Federal Reserve Communications

1 code implementation · 11 Jun 2021 · Jean Lee, Hoyoul Luis Youn, Nicholas Stevens, Josiah Poon, Soyeon Caren Han

The Federal Reserve System (the Fed) plays a significant role in affecting monetary policy and financial conditions worldwide.

Sentiment Analysis · Text Classification

A Survey on Extraction of Causal Relations from Natural Language Text

no code implementations · 16 Jan 2021 · Jie Yang, Soyeon Caren Han, Josiah Poon

Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches.

BIG-bench Machine Learning · Feature Engineering +2

VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks

1 code implementation · 7 Oct 2020 · Soyeon Caren Han, Siqu Long, Siwen Luo, Kunze Wang, Josiah Poon

We propose a new visual contextual text representation for text-to-image multimodal tasks, VICTR, which captures rich visual semantic information of objects from the text input.

Ranked #24 on Text-to-Image Generation on MS COCO (Inception score metric)

Dependency Parsing · Sentence +1

REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering

1 code implementation · 27 Jul 2020 · Siwen Luo, Soyeon Caren Han, Kaiyuan Sun, Josiah Poon

Visual question answering (VQA) is a challenging multi-modal task that requires not only the semantic understanding of both images and questions, but also the sound perception of a step-by-step reasoning process that would lead to the correct answer.

Question Answering · Visual Question Answering
