Search Results for author: Jenhao Hsiao

Found 4 papers, 2 papers with code

Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

1 code implementation • CVPR 2023 • Xinmiao Lin, Yikang Li, Jenhao Hsiao, Chiuman Ho, Yu Kong

The popular VQ-VAE models reconstruct images through learning a discrete codebook but suffer from a significant issue in the rapid quality degradation of image reconstruction as the compression rate rises.

Image Generation Image Reconstruction

VideoXum: Cross-modal Visual and Textural Summarization of Videos

1 code implementation • 21 Mar 2023 • Jingyang Lin, Hang Hua, Ming Chen, Yikang Li, Jenhao Hsiao, Chiuman Ho, Jiebo Luo

We propose a new joint video and text summarization task.

Ranked #1 on Video Summarization on videoxum

Text Summarization Video Summarization

Open Vocabulary Multi-Label Classification with Dual-Modal Decoder on Aligned Visual-Textual Features

no code implementations • 19 Aug 2022 • Shichao Xu, Yikang Li, Jenhao Hsiao, Chiuman Ho, Zhu Qi

In computer vision, multi-label recognition is an important task with many real-world applications, but classifying previously unseen labels remains a significant challenge.

Ranked #1 on Multi-label zero-shot learning on ImageNet-1k to MSCOCO

Classification Multi-Label Classification

GCF-Net: Gated Clip Fusion Network for Video Action Recognition

no code implementations • 2 Feb 2021 • Jenhao Hsiao, Jiawei Chen, Chiuman Ho

These models are trained by applying a deep CNN on a single clip of fixed temporal length.

Action Recognition Temporal Action Localization
