GRITv2: Efficient and Light-weight Social Relation Recognition

11 Mar 2024  ·  N K Sagar Reddy, Neeraj Kasera, Avinash Thakur ·

Our research focuses on the analysis and improvement of the Graph-based Relation Inference Transformer (GRIT), which serves as an important benchmark in the field. We conduct a comprehensive ablation study using the PISC-fine dataset, to find and explore improvement in efficiency and performance of GRITv2. Our research has provided a new state-of-the-art relation recognition model on the PISC relation dataset. We introduce several features in the GRIT model and analyse our new benchmarks in two versions: GRITv2-L (large) and GRITv2-S (small). Our proposed GRITv2-L surpasses existing methods on relation recognition and the GRITv2-S is within 2% performance gap of GRITv2-L, which has only 0.0625x the model size and parameters of GRITv2-L. Furthermore, we also address the need for model compression, an area crucial for deploying efficient models on resource-constrained platforms. By applying quantization techniques, we efficiently reduced the GRITv2-S size to 22MB and deployed it on the flagship OnePlus 12 mobile which still surpasses the PISC-fine benchmarks in performance, highlighting the practical viability and improved efficiency of our model on mobile devices.

PDF Abstract

Datasets


Task Dataset Model Metric Name Metric Value Global Rank Benchmark
Visual Social Relationship Recognition PISC GRITv2-L mAP 80.7 # 1
mAP (Coarse) 87.8 # 1

Methods