Search Results for author: Maoyuan Ye

Found 5 papers, 5 papers with code

Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

1 code implementation • 31 Jan 2024 • Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, BaoCai Yin, Cong Liu, Bo Du, DaCheng Tao

In terms of the AMG mode, Hi-SAM segments text stroke foreground masks initially, then samples foreground points for hierarchical text mask generation and achieves layout analysis in passing.

Hierarchical Text Segmentation Segmentation +1

GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching

1 code implementation • 13 Jan 2024 • Haibin He, Maoyuan Ye, Jing Zhang, Juhua Liu, DaCheng Tao

In response to this issue, we propose to efficiently turn an off-the-shelf query-based image text spotter into a specialist on video and present a simple baseline termed GoMatching, which focuses the training efforts on tracking while maintaining strong recognition performance.

Text Detection Text Spotting

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting

2 code implementations • 31 May 2023 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.

Scene Text Detection Text Detection +1

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

2 code implementations • CVPR 2023 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

 Ranked #1 on Text Spotting on Total-Text (using extra training data)

Scene Text Detection Text Detection +2

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

3 code implementations • 10 Jul 2022 • Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, DaCheng Tao

However, these methods, built upon the detection transformer framework, might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the human reading order, which, from our observation, impedes detection robustness.

Inductive Bias Scene Text Detection +1
