Search Results for author: Jingqun Tang

Found 6 papers, 2 papers with code

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

no code implementations • 19 Apr 2024 • Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.

Hallucination, Hallucination Evaluation, +2

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

1 code implementation • 22 Nov 2023 • Zhen Zhao, Jingqun Tang, Chunhui Lin, Binghong Wu, Can Huang, Hao Liu, Xin Tan, Zhizhong Zhang, Yuan Xie

A straightforward solution is performing model fine-tuning tailored to a specific scenario, but it is computationally intensive and requires multiple model copies for various scenarios.

In-Context Learning, Scene Text Recognition

UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding

no code implementations • 19 Aug 2023 • Hao Feng, Zijian Wang, Jingqun Tang, Jinghui Lu, Wengang Zhou, Houqiang Li, Can Huang

However, existing advanced algorithms are limited in their ability to exploit the immense representation capabilities and rich world knowledge of these large pre-trained models, and the beneficial connections among tasks in text-rich scenarios have not been sufficiently explored.

Instruction Following, Text Detection, +1

SPTS v2: Single-Point Scene Text Spotting

3 code implementations • 4 Jan 2023 • Yuliang Liu, Jiaxin Zhang, Dezhi Peng, Mingxin Huang, Xinyu Wang, Jingqun Tang, Can Huang, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

Within the context of our SPTS v2 framework, our experiments suggest a potential preference for single-point representation in scene text spotting when compared to other representations.

Ranked #15 on Text Spotting on ICDAR 2015

Text Detection, Text Spotting

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning

no code implementations • 25 Jul 2022 • Jingqun Tang, Wenming Qian, Luchuan Song, Xiena Dong, Lan Li, Xiang Bai

Text detection and recognition are essential components of a modern OCR system.

Domain Adaptation, Optical Character Recognition (OCR), +2

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

no code implementations • CVPR 2022 • Jingqun Tang, Wenqing Zhang, Hongye Liu, Mingkun Yang, Bo Jiang, Guanglong Hu, Xiang Bai

Different from previous approaches that learn robust deep representations of scene text in a holistic manner, our method performs scene text detection based on a few representative features, which avoids the disturbance by background and reduces the computational cost.

Ranked #21 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection, Object Detection In Aerial Images, +2
