Search Results for author: Binghong Wu

Found 6 papers, 3 papers with code

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

no code implementations • 19 Apr 2024 • Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data.

Hallucination Hallucination Evaluation +2

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

1 code implementation • 22 Nov 2023 • Zhen Zhao, Jingqun Tang, Chunhui Lin, Binghong Wu, Can Huang, Hao Liu, Xin Tan, Zhizhong Zhang, Yuan Xie

A straightforward solution is performing model fine-tuning tailored to a specific scenario, but it is computationally intensive and requires multiple model copies for various scenarios.

In-Context Learning Scene Text Recognition

Contrastive Centroid Supervision Alleviates Domain Shift in Medical Image Classification

no code implementations • 31 May 2022 • Wenshuo Zhou, Dalu Yang, Binghong Wu, Yehui Yang, Junde Wu, Xiaorong Wang, Lei Wang, Haifeng Huang, Yanwu Xu

Deep learning based medical imaging classification models usually suffer from the domain shift problem, where the classification performance drops when training data and real-world data differ in imaging equipment manufacturer, image acquisition protocol, patient populations, etc.

domain classification Domain Generalization +3

Progressive Hard-case Mining across Pyramid Levels for Object Detection

1 code implementation • 15 Sep 2021 • Binghong Wu, Yehui Yang, Dalu Yang, Junde Wu, Xiaorong Wang, Haifeng Huang, Lei Wang, Yanwu Xu

Based on focal loss with ATSS-R50, our approach achieves 40.5 AP, surpassing the state-of-the-art QFL (Quality Focal Loss, 39.9 AP) and VFL (Varifocal Loss, 40.1 AP).

Object Detection

Robust Collaborative Learning of Patch-level and Image-level Annotations for Diabetic Retinopathy Grading from Fundus Image

1 code implementation • 3 Aug 2020 • Yehui Yang, Fangxin Shang, Binghong Wu, Dalu Yang, Lei Wang, Yanwu Xu, Wensheng Zhang, Tianzhu Zhang

As a result, it exploits more discriminative features for DR grading.

Diabetic Retinopathy Grading Medical Diagnosis

Residual-CycleGAN based Camera Adaptation for Robust Diabetic Retinopathy Screening

no code implementations • 31 Jul 2020 • Dalu Yang, Yehui Yang, Tiantian Huang, Binghong Wu, Lei Wang, Yanwu Xu

How can we train a classification model on labeled fundus images acquired from only one camera brand, yet still achieve good performance on images taken by other brands of cameras?

Classification Domain Adaptation +1
