Cross-Image Context Matters for Bongard Problems

7 Sep 2023 · Nikhil Raghuraman, Adam W. Harley, Leonidas Guibas

Current machine learning methods struggle to solve Bongard problems, a type of IQ test that requires deriving an abstract "concept" from a set of positive and negative "support" images, and then classifying whether or not a new query image depicts the key concept. On Bongard-HOI, a benchmark for natural-image Bongard problems, existing methods have only reached 66% accuracy (where chance is 50%). Low accuracy is often attributed to neural networks' inability to discover human-like symbolic rules. In this work, we point out that many existing methods are forfeiting accuracy due to a much simpler problem: they do not incorporate information contained in the support set as a whole, relying instead on information extracted from individual support images. This is a critical issue, because unlike in few-shot object classification, the "key concept" in a typical Bongard problem can only be distinguished using multiple positives and multiple negatives. We explore a variety of simple methods to take this cross-image context into account, and demonstrate substantial gains over prior methods, leading to new state-of-the-art performance on Bongard-LOGO (75.3%) and Bongard-HOI (72.45%), as well as strong performance on the original Bongard problem set (60.84%).
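To make the cross-image idea concrete, the sketch below is a hypothetical illustration rather than the authors' released code: it fits a linear SVM jointly on frozen CLIP RN-50 embeddings of all support images (positives and negatives together) and uses the resulting decision boundary to label the query, so the prediction depends on the support set as a whole. The use of the open_clip and scikit-learn packages, and the helper names `embed` and `solve_bongard`, are assumptions made for this example.

```python
# Hypothetical sketch of a cross-image support-set baseline for a Bongard problem.
# Assumes: open_clip_torch, scikit-learn, torch, Pillow are installed.
import torch
import open_clip
from PIL import Image
from sklearn.svm import LinearSVC

# Frozen CLIP RN-50 image encoder (backbone named in the results table).
model, _, preprocess = open_clip.create_model_and_transforms("RN50", pretrained="openai")
model.eval()

def embed(paths):
    """Encode image files into L2-normalized CLIP features."""
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    with torch.no_grad():
        feats = model.encode_image(imgs)
    return torch.nn.functional.normalize(feats, dim=-1).cpu().numpy()

def solve_bongard(pos_paths, neg_paths, query_path):
    """Fit an SVM on the whole support set, then label the query image.

    Because the boundary is fit on positives AND negatives jointly, the
    prediction uses cross-image context rather than any single support.
    """
    X = embed(pos_paths + neg_paths)
    y = [1] * len(pos_paths) + [0] * len(neg_paths)
    clf = LinearSVC(C=1.0).fit(X, y)
    return int(clf.predict(embed([query_path]))[0])  # 1 = depicts the concept
```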


Datasets

Bongard-LOGO, Bongard-HOI
Results from the Paper


Ranked #2 on Few-Shot Image Classification on Bongard-HOI (using extra training data)

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Few-Shot Image Classification | Bongard-HOI | SVM-Mimic + PMF (fine-tuned CLIP RN-50) | Avg. Accuracy | 76.41 | #2
Few-Shot Image Classification | Bongard-HOI | SVM-Mimic (frozen CLIP RN-50) | Avg. Accuracy | 72.45 | #3

Methods


No methods listed for this paper.