no code implementations • CVPR 2023 • Brandon Clark, Alec Kerrigan, Parth Parag Kulkarni, Vicente Vivanco Cepeda, Mubarak Shah
To this end, we introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels (which we refer to as hierarchies) and the corresponding visual scene information in an image through hierarchical cross-attention.
Ranked #1 on Photo geolocation estimation on GWS15k
no code implementations • 14 Oct 2021 • Ishan Dave, Naman Biyani, Brandon Clark, Rohit Gupta, Yogesh Rawat, Mubarak Shah
This technical report presents our approach "Knights" to solve the action recognition task on a small subset of Kinetics-400, i.e., Kinetics400ViPriors, without using any extra data.