1 code implementation • 14 Oct 2023 • Mark Vincent Ty, Rowel Atienza
In this work, the problem of interest is Scene Text Recognition (STR) Explainability, using XAI to understand the cause of an STR model's prediction.
1 code implementation • 23 May 2023 • Rowel Atienza
State-of-the-art (SOTA) neural text-to-speech (TTS) models can generate natural-sounding synthetic voices.
1 code implementation • 14 Jul 2022 • Darwin Bautista, Rowel Atienza
Context-aware STR methods typically use internal autoregressive (AR) language models (LM).
Ranked #4 on Scene Text Recognition on COCO-Text (using extra training data)
1 code implementation • 22 Apr 2022 • Josen Daniel De Leon, Rowel Atienza
Pruning is a neural network optimization technique that sacrifices accuracy in exchange for lower computational requirements.
1 code implementation • 20 Oct 2021 • Rowel Atienza
Experimental results further show that unlike other regularization terms such as label smoothing, AgMax can take advantage of the data augmentation to consistently improve model generalization by a significant margin.
1 code implementation • 16 Aug 2021 • Rowel Atienza
Scene text recognition (STR) is a challenging task in computer vision due to the large number of possible text appearances in natural scenes.
1 code implementation • 22 May 2021 • Henri Tomas, Marcus Reyes, Raimarc Dionido, Mark Ty, Jonric Mirando, Joel Casimiro, Rowel Atienza, Richard Guinto
To this end, we present a challenging new task called gaze object prediction, where the goal is to predict a bounding box for a person's gazed-at object.
3 code implementations • 18 May 2021 • Rowel Atienza
On a comparable strong baseline method such as TRBA with accuracy of 84.3%, our small ViTSTR achieves a competitive accuracy of 82.6% (84.2% with data augmentation) at 2.4x speed up, using only 43.4% of the number of parameters and 42.2% FLOPS.
Ranked #8 on Scene Text Recognition on ICDAR 2003
2 code implementations • 28 Aug 2020 • Daryl Peralta, Joel Casimiro, Aldrin Michael Nilles, Justine Aletta Aguilar, Rowel Atienza, Rhandley Cajote
Our experiments show that using Scan-RL, the agent can scan houses in fewer steps and over a shorter distance compared to our baseline circular path.
1 code implementation • IEEE 2019 CVPR Workshop 2019 • Rowel Atienza
PSPU-SkelNet is a pyramid of three U-Nets that predicts the skeleton from a given shape point cloud.
1 code implementation • IEEE 2019 CVPR Workshop 2019 • Rowel Atienza
In computer graphics, point clouds from laser scanning devices are difficult to render into photo-realistic images because they carry little information about color, normals, lighting, and the connections between points.
1 code implementation • 19 May 2018 • Rowel Atienza
Disparity estimation is a difficult problem in stereo vision because the correspondence technique fails in images with textureless and repetitive regions.