Search Results for author: Austin Myers

Found 6 papers, 4 papers with code

Streaming Dense Video Captioning

1 code implementation • 1 Apr 2024 • Xingyi Zhou, Anurag Arnab, Shyamal Buch, Shen Yan, Austin Myers, Xuehan Xiong, Arsha Nagrani, Cordelia Schmid

An ideal model for dense video captioning -- predicting captions localized temporally in a video -- should be able to handle long input videos, predict rich, detailed textual descriptions, and be able to produce outputs before processing the entire video.

Dense Video Captioning

IC3: Image Captioning by Committee Consensus

1 code implementation • 2 Feb 2023 • David M. Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A. Ross, John Canny

If you ask a human to describe an image, they might do so in a thousand different ways.

Image Captioning

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

no code implementations • 20 Dec 2022 • Vivek Rathod, Bryan Seybold, Sudheendra Vijayanarasimhan, Austin Myers, Xiuye Gu, Vighnesh Birodkar, David A. Ross

Detecting actions in untrimmed videos should not be limited to a small, closed set of classes.

Action Detection, Optical Flow Estimation

Distribution Aware Metrics for Conditional Natural Language Generation

no code implementations • 15 Sep 2022 • David M. Chan, Yiming Ni, David A. Ross, Sudheendra Vijayanarasimhan, Austin Myers, John Canny

In this work we argue that existing metrics are not appropriate for domains such as visual description or summarization where ground truths are semantically diverse, and where the diversity in those captions captures useful additional information about the context.

Speech Recognition +1

What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics

1 code implementation • 12 May 2022 • David M. Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A. Ross, Bryan Seybold, John F. Canny

While there have been significant gains in the field of automated video description, the generalization performance of automated description models to novel domains remains a major barrier to using these systems in the real world.

Video Description

VideoBERT: A Joint Model for Video and Language Representation Learning

3 code implementations • ICCV 2019 • Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, Cordelia Schmid

Self-supervised learning has become increasingly important to leverage the abundance of unlabeled data available on platforms like YouTube.

Ranked #1 on Action Classification on YouCook2

Action Classification, General Classification +7
