Search Results for author: Siddhant Bansal

Found 10 papers, 5 papers with code

HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision

no code implementations • 15 Apr 2024 • Siddhant Bansal, Michael Wray, Dima Damen

Our results demonstrate that VLMs trained for referral on third person images fail to recognise and refer hands and objects in egocentric images.

Object Question Answering +1

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

no code implementations • 30 Nov 2023 • Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

United We Stand, Divided We Fall: UnityGraph for Unsupervised Procedure Learning from Videos

no code implementations • 6 Nov 2023 • Siddhant Bansal, Chetan Arora, C. V. Jawahar

Given multiple videos of the same task, procedure learning addresses identifying the key-steps and determining their order to perform the task.

Procedure Learning

An Outlook into the Future of Egocentric Vision

no code implementations • 14 Aug 2023 • Chiara Plizzari, Gabriele Goletto, Antonino Furnari, Siddhant Bansal, Francesco Ragusa, Giovanni Maria Farinella, Dima Damen, Tatiana Tommasi

What will the future be?

My View is the Best View: Procedure Learning from Egocentric Videos

1 code implementation • 22 Jul 2022 • Siddhant Bansal, Chetan Arora, C. V. Jawahar

Instead, we propose to use the signal provided by the temporal correspondences between key-steps across videos.

Procedure Learning

Making AI 'Smart': Bridging AI and Cognitive Science

no code implementations • 31 Dec 2021 • Madhav Agarwal, Siddhant Bansal

This will help develop more powerful AI systems and simultaneously give us a better understanding of how the human brain works.

Ego4D: Around the World in 3,000 Hours of Egocentric Video

6 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

Improving Word Recognition using Multiple Hypotheses and Deep Embeddings

1 code implementation • 27 Oct 2020 • Siddhant Bansal, Praveen Krishnan, C. V. Jawahar

We propose a novel scheme for improving the word recognition accuracy using word image embeddings.

Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval

1 code implementation • 1 Jul 2020 • Siddhant Bansal, Praveen Krishnan, C. V. Jawahar

Recognition and retrieval of textual content from large document collections have been a powerful use case for the document image analysis community.

Optical Character Recognition (OCR) Retrieval

AGDC: Automatic Garbage Detection and Collection

1 code implementation • 16 Aug 2019 • Siddhant Bansal, Seema Patel, Ishita Shah, Prof. Alpesh Patel, Prof. Jagruti Makwana, Dr. Rajesh Thakker

Waste management is one of the significant problems throughout the world.

Robotics
