Search Results for author: Guoxin Wang

Found 15 papers, 5 papers with code

XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding

no code implementations • Findings (ACL) 2022 • Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei

Multimodal pre-training with text, layout, and image has recently achieved SOTA performance on visually rich document understanding tasks, demonstrating the great potential of joint learning across different modalities.

document understanding

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

no code implementations • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding

Consequently, a simple combination of the two cannot guarantee both training efficiency and inference efficiency at minimal cost.

Model Compression

A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders

no code implementations • 15 Jan 2024 • Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang

Previous research on the diagnosis of bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging.

Medical Diagnosis

Unsupervised Pre-Training Using Masked Autoencoders for ECG Analysis

no code implementations • 17 Oct 2023 • Guoxin Wang, Qingyuan Wang, Ganesh Neelakanta Iyer, Avishek Nag, Deepu John

Unsupervised learning methods have become increasingly important in deep learning owing to their ability to exploit large datasets and their high accuracy in computer vision and natural language processing tasks.

Unsupervised Pre-training
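
For readers unfamiliar with the technique named in the title, the snippet below is a minimal, self-contained sketch of masked-autoencoder pre-training on 1-D signals. It illustrates the general idea only and is not the paper's architecture; the patch length, masking ratio, model sizes, and the Tiny1DMAE name are illustrative assumptions (positional embeddings are omitted for brevity).

```python
import torch
import torch.nn as nn

class Tiny1DMAE(nn.Module):
    """Illustrative masked autoencoder for 1-D signals (not the paper's model)."""
    def __init__(self, patch_len=50, dim=64, mask_ratio=0.75):
        super().__init__()
        self.patch_len = patch_len
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(patch_len, dim)                  # patch -> token
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Linear(dim, patch_len)                # token -> reconstructed patch
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))  # placeholder for masked slots

    def forward(self, x):                                       # x: (B, L), L divisible by patch_len
        b, l = x.shape
        patches = x.view(b, l // self.patch_len, self.patch_len)
        tokens = self.embed(patches)                            # (B, N, dim)
        n, d = tokens.shape[1], tokens.shape[2]
        n_keep = max(1, int(n * (1 - self.mask_ratio)))
        keep = torch.rand(b, n, device=x.device).argsort(1)[:, :n_keep]   # visible patch indices
        idx = keep.unsqueeze(-1).expand(-1, -1, d)
        encoded = self.encoder(torch.gather(tokens, 1, idx))    # encode visible patches only
        full = self.mask_token.expand(b, n, d).scatter(1, idx, encoded)   # fill masked slots
        recon = self.decoder(full)                               # (B, N, patch_len)
        masked = torch.ones(b, n, device=x.device).scatter(1, keep, 0.0)  # 1 where loss is scored
        return (((recon - patches) ** 2).mean(-1) * masked).sum() / masked.sum()

# One unsupervised pre-training step on a batch of unlabeled ECG-like windows.
model = Tiny1DMAE()
loss = model(torch.randn(8, 1000))   # 8 windows of 1000 samples each
loss.backward()
```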

Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Classification

no code implementations • 4 Oct 2023 • Guoxin Wang, Xuyang Cao, Shan An, Fengmei Fan, Chao Zhang, Jinsong Wang, Feng Yu, Zhiren Wang

In this work, we propose a multi-dimension-embedding-aware modality fusion transformer (MFFormer) for schizophrenia and bipolar disorder classification using rs-fMRI and T1-weighted structural MRI (T1w sMRI).

Time Series

Unifying Vision, Text, and Layout for Universal Document Processing

2 code implementations • CVPR 2023 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal

UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.

Ranked #5 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)

document understanding, Image Reconstruction +1
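
As a rough illustration of what "one uniform representation" of text and layout can look like, the sketch below sums word embeddings with embeddings of quantized bounding-box coordinates so that every token carries both its content and its position on the page. This is a generic text-plus-layout embedding, not the UDOP implementation; the vocabulary size, coordinate binning, and the TextLayoutEmbedding name are assumptions for the example.

```python
import torch
import torch.nn as nn

class TextLayoutEmbedding(nn.Module):
    """Illustrative joint text+layout token embedding (not the UDOP code)."""
    def __init__(self, vocab_size=30522, dim=256, coord_bins=1000):
        super().__init__()
        self.word = nn.Embedding(vocab_size, dim)
        # One table per bounding-box coordinate (x0, y0, x1, y1),
        # with page coordinates quantized into `coord_bins` buckets.
        self.coords = nn.ModuleList([nn.Embedding(coord_bins, dim) for _ in range(4)])

    def forward(self, token_ids, boxes):
        # token_ids: (B, T) word-piece ids; boxes: (B, T, 4) quantized coordinates
        emb = self.word(token_ids)
        for i, table in enumerate(self.coords):
            emb = emb + table(boxes[..., i])    # inject layout into each text token
        return emb                              # (B, T, dim): one joint representation

# Usage: OCR tokens plus their bounding boxes, ready for any Transformer encoder.
embed = TextLayoutEmbedding()
ids = torch.randint(0, 30522, (2, 16))
boxes = torch.randint(0, 1000, (2, 16, 4))
joint = embed(ids, boxes)                       # shape: (2, 16, 256)
```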

Understanding Long Documents with Different Position-Aware Attentions

no code implementations • 17 Aug 2022 • Hai Pham, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang

Despite several successes in document understanding, the practical task of long document understanding remains largely under-explored, owing to challenges in computation and in efficiently absorbing long multimodal input.

document understanding, Position

BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via prior Boundary Knowledge

no code implementations • 20 Jul 2021 • Huiqiang Jiang, Guoxin Wang, Weile Chen, Chengxi Zhang, Börje F. Karlsson

While named entity recognition (NER) is a key task in natural language processing, most approaches only target flat entities, ignoring nested structures which are common in many scenarios.

named-entity-recognition, Named Entity Recognition +3

LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding

6 code implementations • 18 Apr 2021 • Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei

In this paper, we present LayoutXLM, a multimodal pre-trained model for multilingual document understanding, which aims to bridge the language barriers for visually-rich document understanding.

Document Image Classification, document understanding

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

5 code implementations • ACL 2021 • Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou

Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents.

Document Image Classification, Document Layout Analysis +6

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

1 code implementation • 14 Nov 2019 • Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER).

Cross-Lingual NER, Meta-Learning +4
