In this work, we show the process of building a large-scale training set from digital and digitized collections at a national library.
Digital-twin technology, which creates precise digital replicas of physical objects, has significant potential to reshape AR experiences in 3D object tracking and localization scenarios.
Tiny DL models are proposed and compared, including a tiny Vision Transformer (TViT), a tiny VGG16 (TVGG), and a tiny Swin-Transformer (TSwinT).
Companies today are racing to leverage the latest digital technologies, such as artificial intelligence, blockchain, and cloud computing.
Text recognition is a long-standing research problem in document digitization.
Digital twinning is the problem of augmenting real objects with their digital counterparts.
Scientific knowledge is predominantly stored in books and scientific journals, often in the form of PDFs.
Optical Character Recognition (OCR)
Pre-training on text and layout has proved effective in a variety of visually rich document understanding tasks, owing to effective model architectures and the availability of large-scale unlabeled scanned and digital-born documents.
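To make the text-plus-layout idea concrete, here is a minimal NumPy sketch of how a token embedding can be combined with 2-D layout embeddings derived from the token's bounding box. The embedding tables, dimensions, and coordinate bucketing are all hypothetical illustrations, not the architecture of any particular pre-trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, coord_bins, dim = 100, 1000, 16

# Hypothetical embedding tables: one for token ids, and one per
# normalized bounding-box coordinate (x0, y0, x1, y1).
tok_emb = rng.normal(size=(vocab_size, dim))
pos_emb = {k: rng.normal(size=(coord_bins, dim)) for k in ("x0", "y0", "x1", "y1")}

def embed(token_id, box):
    """Sum the token embedding with 2-D layout embeddings for its box."""
    x0, y0, x1, y1 = box  # coordinates already bucketed into [0, coord_bins)
    return (tok_emb[token_id]
            + pos_emb["x0"][x0] + pos_emb["y0"][y0]
            + pos_emb["x1"][x1] + pos_emb["y1"][y1])

vec = embed(42, (10, 20, 110, 45))  # one token with its bounding box
```

Summing (rather than concatenating) the layout signals keeps the input dimensionality fixed, so the same transformer stack can consume text-only and layout-aware inputs.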
Self-supervised bidirectional transformer models such as BERT have led to dramatic improvements in a wide variety of textual classification tasks.
We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters.
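The length-as-probability reading relies on a nonlinearity that preserves a vector's orientation while squashing its length into [0, 1). The following NumPy sketch illustrates one such squashing function; the specific vector values are made-up inputs for demonstration.

```python
import numpy as np

def squash(v, eps=1e-8):
    # Keeps the vector's orientation but maps its length into [0, 1),
    # so the length can be read as an existence probability.
    norm_sq = np.sum(v ** 2, axis=-1, keepdims=True)
    norm = np.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (v / norm)

u = np.array([0.5, -1.0, 2.0, 0.0, 0.3, -0.2, 1.5, 0.1])  # a toy activity vector
s = squash(u)
p_exists = np.linalg.norm(s)           # length: probability the entity exists
orientation = s / np.linalg.norm(s)    # unit vector: instantiation parameters
```

Because the squashing is purely radial, the direction of `s` matches the direction of `u`, so the instantiation parameters survive the nonlinearity unchanged.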