TraVLR is a synthetic dataset comprising four visio-linguistic reasoning tasks. Each example encodes the scene bimodally such that either modality can be dropped during training/testing with no loss of relevant information. TraVLR's training and testing distributions are also constrained along task-relevant dimensions, enabling the evaluation of out-of-distribution generalisation.
Paper | Code | Results | Date | Stars
---|---|---|---|---