Search Results for author: Wenwen Yu

Found 12 papers, 6 papers with code

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

1 code implementation • 28 Mar 2024 • Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions.

document understanding Key Information Extraction +3

930

P2Seg: Pointly-supervised Segmentation via Mutual Distillation

no code implementations • 18 Jan 2024 • Zipeng Wang, Xuehui Yu, Xumeng Han, Wenwen Yu, Zhixun Huang, Jianbin Jiao, Zhenjun Han

Nevertheless, weakly supervised semantic segmentation methods are proficient in utilizing intra-class feature consistency to capture the boundary contours of the same semantic regions.

Box-supervised Instance Segmentation Segmentation +2

P2RBox: A Single Point is All You Need for Oriented Object Detection

no code implementations • 22 Nov 2023 • Guangming Cao, Xuehui Yu, Wenwen Yu, Xumeng Han, Xue Yang, Guorong Li, Jianbin Jiao, Zhenjun Han

In this study, we introduce the P2RBox network, which leverages point annotations and a mask generator to create mask proposals, followed by filtration through our Inspector Module and Constrainer Module.

Object object-detection +2

Turning a CLIP Model into a Scene Text Spotter

1 code implementation • 21 Aug 2023 • Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai

Utilizing only 10% of the supervised data, FastTCM-CR50 improves performance by an average of 26.5% and 5.5% for text detection and spotting tasks, respectively.

object-detection Object Detection +3

148

Looking and Listening: Audio Guided Text Recognition

1 code implementation • 6 Jun 2023 • Wenwen Yu, MingYu Liu, Biao Yang, Enming Zhang, Deqiang Jiang, Xing Sun, Yuliang Liu, Xiang Bai

Text recognition in the wild is a long-standing problem in computer vision.

Scene Text Recognition

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai

It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.

Document AI Entity Linking +1

ICDAR 2023 Competition on Reading the Seal Title

no code implementations • 24 Apr 2023 • Wenwen Yu, MingYu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, Xiang Bai

To promote research in this area, we organized ICDAR 2023 competition on reading the seal title (ReST), which included two tasks: seal title text detection (Task 1) and end-to-end seal title recognition (Task 2).

Optical Character Recognition (OCR) Task 2 +1

Turning a CLIP Model into a Scene Text Detector

1 code implementation • CVPR 2023 • Wenwen Yu, Yuliang Liu, Wei Hua, Deqiang Jiang, Bo Ren, Xiang Bai

Recently, pretraining approaches based on vision language models have made effective progress in the field of text detection.

Domain Adaptation Scene Text Detection +1

148

ADASYN-Random Forest Based Intrusion Detection Model

no code implementations • 10 May 2021 • Zhewei Chen, Wenwen Yu, Linyue Zhou

Through the comparative experiment of Intrusion detection on CICIDS 2017 dataset, it is found that ADASYN with Random Forest performs better.

Intrusion Detection

Unsupervised Domain Adaptation Network with Category-Centric Prototype Aligner for Biomedical Image Segmentation

no code implementations • 3 Mar 2021 • Ping Gong, Wenwen Yu, Qiuwen Sun, Ruohan Zhao, Junfeng Hu

Specifically, our approach consists of two key modules, a conditional domain discriminator (CDD) and a category-centric prototype aligner (CCPA).

Image Segmentation object-detection +4

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

2 code implementations • 16 Apr 2020 • Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, Rong Xiao

Computer vision with state-of-the-art deep learning models has achieved huge success in the field of Optical Character Recognition (OCR) including text detection and recognition tasks recently.

Graph Learning Key Information Extraction +3

541

MASTER: Multi-Aspect Non-local Network for Scene Text Recognition

7 code implementations • 7 Oct 2019 • Ning Lu, Wenwen Yu, Xianbiao Qi, Yihao Chen, Ping Gong, Rong Xiao, Xiang Bai

Attention-based scene text recognizers have gained huge success, which leverages a more compact intermediate representation to learn 1d- or 2d- attention by a RNN-based encoder-decoder architecture.

Scene Text Recognition

4,075
