Search Results for author: Guoxin Wang

Found 15 papers, 5 papers with code

XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding

no code implementations • Findings (ACL) 2022 • Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei

Multimodal pre-training with text, layout, and image has recently achieved SOTA performance on visually rich document understanding tasks, demonstrating the great potential of joint learning across different modalities.

document understanding

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

no code implementations • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding

Consequently, a simple combination of the two cannot guarantee both training efficiency and inference efficiency at minimal cost.

Model Compression

A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders

no code implementations • 15 Jan 2024 • Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang

Previous research on the diagnosis of bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging.

Medical Diagnosis

Unsupervised Pre-Training Using Masked Autoencoders for ECG Analysis

no code implementations • 17 Oct 2023 • Guoxin Wang, Qingyuan Wang, Ganesh Neelakanta Iyer, Avishek Nag, Deepu John

Unsupervised learning methods have become increasingly important in deep learning owing to their ability to exploit large datasets and their high accuracy in computer vision and natural language processing tasks.

Unsupervised Pre-training
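
For readers unfamiliar with the technique named in the title, the snippet below is a minimal, self-contained sketch of masked-autoencoder pre-training on 1-D signals. It illustrates the general idea only and is not the paper's architecture; the patch length, masking ratio, model sizes, and the Tiny1DMAE name are illustrative assumptions (positional embeddings are omitted for brevity).

```python
import torch
import torch.nn as nn

class Tiny1DMAE(nn.Module):
    """Illustrative masked autoencoder for 1-D signals (not the paper's model)."""
    def __init__(self, patch_len=50, dim=64, mask_ratio=0.75):
        super().__init__()
        self.patch_len = patch_len
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(patch_len, dim)                  # patch -> token
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Linear(dim, patch_len)                # token -> reconstructed patch
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))  # placeholder for masked slots

    def forward(self, x):                                       # x: (B, L), L divisible by patch_len
        b, l = x.shape
        patches = x.view(b, l // self.patch_len, self.patch_len)
        tokens = self.embed(patches)                            # (B, N, dim)
        n, d = tokens.shape[1], tokens.shape[2]
        n_keep = max(1, int(n * (1 - self.mask_ratio)))
        keep = torch.rand(b, n, device=x.device).argsort(1)[:, :n_keep]   # visible patch indices
        idx = keep.unsqueeze(-1).expand(-1, -1, d)
        encoded = self.encoder(torch.gather(tokens, 1, idx))    # encode visible patches only
        full = self.mask_token.expand(b, n, d).scatter(1, idx, encoded)   # fill masked slots
        recon = self.decoder(full)                               # (B, N, patch_len)
        masked = torch.ones(b, n, device=x.device).scatter(1, keep, 0.0)  # 1 where loss is scored
        return (((recon - patches) ** 2).mean(-1) * masked).sum() / masked.sum()

# One unsupervised pre-training step on a batch of unlabeled ECG-like windows.
model = Tiny1DMAE()
loss = model(torch.randn(8, 1000))   # 8 windows of 1000 samples each
loss.backward()
```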

Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Classification

no code implementations • 4 Oct 2023 • Guoxin Wang, Xuyang Cao, Shan An, Fengmei Fan, Chao Zhang, Jinsong Wang, Feng Yu, Zhiren Wang

In this work, we propose a multi-dimension-embedding-aware modality fusion transformer (MFFormer) for schizophrenia and bipolar disorder classification using rs-fMRI and T1-weighted structural MRI (T1w sMRI).

Time Series

Unifying Vision, Text, and Layout for Universal Document Processing

2 code implementations • CVPR 2023 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal

UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.

Ranked #5 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)

document understanding, Image Reconstruction +1
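
As a rough illustration of what "one uniform representation" of text and layout can look like, the sketch below sums word embeddings with embeddings of quantized bounding-box coordinates so that every token carries both its content and its position on the page. This is a generic text-plus-layout embedding, not the UDOP implementation; the vocabulary size, coordinate binning, and the TextLayoutEmbedding name are assumptions for the example.

```python
import torch
import torch.nn as nn

class TextLayoutEmbedding(nn.Module):
    """Illustrative joint text+layout token embedding (not the UDOP code)."""
    def __init__(self, vocab_size=30522, dim=256, coord_bins=1000):
        super().__init__()
        self.word = nn.Embedding(vocab_size, dim)
        # One table per bounding-box coordinate (x0, y0, x1, y1),
        # with page coordinates quantized into `coord_bins` buckets.
        self.coords = nn.ModuleList([nn.Embedding(coord_bins, dim) for _ in range(4)])

    def forward(self, token_ids, boxes):
        # token_ids: (B, T) word-piece ids; boxes: (B, T, 4) quantized coordinates
        emb = self.word(token_ids)
        for i, table in enumerate(self.coords):
            emb = emb + table(boxes[..., i])    # inject layout into each text token
        return emb                              # (B, T, dim): one joint representation

# Usage: OCR tokens plus their bounding boxes, ready for any Transformer encoder.
embed = TextLayoutEmbedding()
ids = torch.randint(0, 30522, (2, 16))
boxes = torch.randint(0, 1000, (2, 16, 4))
joint = embed(ids, boxes)                       # shape: (2, 16, 256)
```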

Understanding Long Documents with Different Position-Aware Attentions

no code implementations • 17 Aug 2022 • Hai Pham, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang

Despite several successes in document understanding, the practical task of long document understanding remains largely under-explored, owing to challenges in computation and in efficiently absorbing long multimodal input.

document understanding, Position

BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via prior Boundary Knowledge

no code implementations • 20 Jul 2021 • Huiqiang Jiang, Guoxin Wang, Weile Chen, Chengxi Zhang, Börje F. Karlsson

While named entity recognition (NER) is a key task in natural language processing, most approaches only target flat entities, ignoring nested structures which are common in many scenarios.

named-entity-recognition, Named Entity Recognition +3

LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding

6 code implementations • 18 Apr 2021 • Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei

In this paper, we present LayoutXLM, a multimodal pre-trained model for multilingual document understanding, which aims to bridge the language barriers for visually-rich document understanding.

Document Image Classification, document understanding

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

5 code implementations • ACL 2021 • Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou

Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents.

Document Image Classification, Document Layout Analysis +6

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

1 code implementation • 14 Nov 2019 • Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER).

Cross-Lingual NER, Meta-Learning +4
