Search Results for author: Kumar Ashutosh

Found 11 papers, 1 papers with code

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

no code implementations8 Apr 2024 Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman

We propose a novel self-supervised embedding to learn how actions sound from narrated in-the-wild egocentric videos.

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

no code implementations30 Nov 2023 Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

HierVL: Learning Hierarchical Video-Language Embeddings

no code implementations CVPR 2023 Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman

Video-language embeddings are a promising avenue for injecting semantics into visual representations, but existing methods capture only short-term associations between seconds-long video clips and their accompanying text.

Action Classification Action Recognition +3

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

no code implementations5 Jan 2023 Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, Kristen Grauman

Narrated ''how-to'' videos have emerged as a promising data source for a wide range of learning problems, from learning visual representations to training robot policies.

RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging

no code implementations15 Oct 2022 Ajay Jaiswal, Kumar Ashutosh, Justin F Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding

Our extensive experiments on popular medical imaging classification tasks (cardiopulmonary disease and lesion classification) using real-world datasets, show the performance benefit of RoS-KD, its ability to distill knowledge from many popular large networks (ResNet-50, DenseNet-121, MobileNet-V2) in a comparatively small network, and its robustness to adversarial attacks (PGD, FSGM).

Classification Knowledge Distillation +1

3D-NVS: A 3D Supervision Approach for Next View Selection

no code implementations3 Dec 2020 Kumar Ashutosh, Saurabh Kumar, Subhasis Chaudhuri

We present a classification based approach for the next best view selection and show how we can plausibly obtain a supervisory signal for this task.

3D Reconstruction

Lower Bounds for Policy Iteration on Multi-action MDPs

no code implementations16 Sep 2020 Kumar Ashutosh, Sarthak Consul, Bhishma Dedhia, Parthasarathi Khirwadkar, Sahil Shah, Shivaram Kalyanakrishnan

An important theoretical question is how many iterations a specified PI variant will take to terminate as a function of the number of states $n$ and the number of actions $k$ in the input MDP.

Bandit algorithms: Letting go of logarithmic regret for statistical robustness

1 code implementation22 Jun 2020 Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan

We study regret minimization in a stochastic multi-armed bandit setting and establish a fundamental trade-off between the regret suffered under an algorithm, and its statistical robustness.

Analysis of Lower Bounds for Simple Policy Iteration

no code implementations28 Nov 2019 Sarthak Consul, Bhishma Dedhia, Kumar Ashutosh, Parthasarathi Khirwadkar

We generalize the previous result and prove a novel exponential lower bound on the number of iterations taken by policy iteration for $N-$state, $k-$action MDPs.

Cannot find the paper you are looking for? You can Submit a new open access paper.