Search Results for author: Chris Tensmeyer

Found 15 papers, 8 papers with code

DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents

1 code implementation • 9 Jun 2023 • Fuxiao Liu, Hao Tan, Chris Tensmeyer

In this work, we propose DocumentCLIP, a salience-aware contrastive learning framework to enforce vision-language pretraining models to comprehend the interaction between images and longer text within documents.

Contrastive Learning • document understanding

MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

no code implementations • 27 Nov 2022 • Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.

End-to-end Document Recognition and Understanding with Dessurt

2 code implementations • 30 Mar 2022 • Brian Davis, Bryan Morse, Bryan Price, Chris Tensmeyer, Curtis Wigington, Vlad Morariu

Dessurt is a more flexible model than prior methods and is able to handle a variety of document domains and tasks.

Ranked #31 on Visual Question Answering (VQA) on DocVQA test

document understanding • Visual Question Answering (VQA)

Towards Language-Free Training for Text-to-Image Generation

no code implementations • CVPR 2022 • Yufan Zhou, Ruiyi Zhang, Changyou Chen, Chunyuan Li, Chris Tensmeyer, Tong Yu, Jiuxiang Gu, Jinhui Xu, Tong Sun

One of the major challenges in training text-to-image generation models is the need for a large number of high-quality text-image pairs.

Zero-Shot Text-to-Image Generation

LAFITE: Towards Language-Free Training for Text-to-Image Generation

2 code implementations • 27 Nov 2021 • Yufan Zhou, Ruiyi Zhang, Changyou Chen, Chunyuan Li, Chris Tensmeyer, Tong Yu, Jiuxiang Gu, Jinhui Xu, Tong Sun

One of the major challenges in training text-to-image generation models is the need for a large number of high-quality image-text pairs.

Ranked #2 on Text-to-Image Generation on Multi-Modal-CelebA-HQ

Zero-Shot Text-to-Image Generation

RPCL: A Framework for Improving Cross-Domain Detection with Auxiliary Tasks

no code implementations • 18 Apr 2021 • Kai Li, Curtis Wigington, Chris Tensmeyer, Vlad I. Morariu, Handong Zhao, Varun Manjunatha, Nikolaos Barmpalios, Yun Fu

Contrasted with prior work, this paper provides a complementary solution to align domains by learning the same auxiliary tasks in both domains simultaneously.

Text and Style Conditioned GAN for Generation of Offline Handwriting Lines

1 code implementation • 1 Sep 2020 • Brian Davis, Chris Tensmeyer, Brian Price, Curtis Wigington, Bryan Morse, Rajiv Jain

This paper presents a GAN for generating images of handwritten lines conditioned on arbitrary text and latent style vectors.

Handwriting generation

Cross-Domain Document Object Detection: Benchmark Suite and Method

1 code implementation • CVPR 2020 • Kai Li, Curtis Wigington, Chris Tensmeyer, Handong Zhao, Nikolaos Barmpalios, Vlad I. Morariu, Varun Manjunatha, Tong Sun, Yun Fu

We establish a benchmark suite consisting of different types of PDF document datasets that can be utilized for cross-domain DOD model training and evaluation.

Object Detection

Deep Visual Template-Free Form Parsing

3 code implementations • 5 Sep 2019 • Brian Davis, Bryan Morse, Scott Cohen, Brian Price, Chris Tensmeyer

Automatic, template-free extraction of information from form images is challenging due to the variety of form layouts.

Start, Follow, Read: End-to-End Full-Page Handwriting Recognition

1 code implementation • ECCV 2018 • Curtis Wigington, Chris Tensmeyer, Brian Davis, William Barrett, Brian Price, Scott Cohen

Despite decades of research, offline handwriting recognition (HWR) of degraded historical documents remains a challenging problem, which, if solved, could greatly improve the searchability of online cultural heritage archives.

Ranked #12 on Handwritten Text Recognition on IAM

Handwriting Recognition Handwritten Text Recognition +4

Language Model Supervision for Handwriting Recognition Model Adaptation

no code implementations • 4 Aug 2018 • Chris Tensmeyer, Curtis Wigington, Brian Davis, Seth Stewart, Tony Martinez, William Barrett

Training state-of-the-art offline handwriting recognition (HWR) models requires large labeled datasets, but such datasets are not available in all languages and domains due to the high cost of manual labeling. We address this problem by showing how high-resource languages can be leveraged to help train models for low-resource languages. We propose a transfer learning methodology where we adapt HWR models trained on a source language to a target language that uses the same writing script. This methodology requires only labeled data in the source language, unlabeled data in the target language, and a language model of the target language.

Handwriting Recognition Language Modelling +1

PageNet: Page Boundary Extraction in Historical Handwritten Documents

3 code implementations • 5 Sep 2017 • Chris Tensmeyer, Brian Davis, Curtis Wigington, Iain Lee, Bill Barrett

When digitizing a document into an image, it is common to include a surrounding border region to visually indicate that the entire document is present in the image.

Convolutional Neural Networks for Font Classification

no code implementations • 11 Aug 2017 • Chris Tensmeyer, Daniel Saunders, Tony Martinez

This same method also achieves the highest reported accuracy of 86.6% in predicting paleographic scribal script classes at the page level on medieval Latin manuscripts.

Classification Data Augmentation +3