2 code implementations • 17 Feb 2024 • Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White
Like large language models, PFNs use pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass.
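The in-context behavior described above can be illustrated with a toy sketch (not the actual PFN architecture): the labeled training rows act as context, and predictions for query rows come from a single pass of attention over that context, with no gradient updates on the new task. The Gaussian attention here is an illustrative stand-in for a trained transformer.

```python
import numpy as np

def pfn_forward(X_ctx, y_ctx, X_query, tau=1.0):
    """Toy single-forward-pass predictor in the spirit of a PFN:
    labeled context rows and query rows are consumed together, and
    class probabilities come from attention over the context."""
    # squared-distance attention scores between query and context rows
    d2 = ((X_query[:, None, :] - X_ctx[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / tau)
    w /= w.sum(axis=1, keepdims=True)
    # one-hot context labels -> class probabilities via attention weights
    n_cls = int(y_ctx.max()) + 1
    onehot = np.eye(n_cls)[y_ctx]
    return w @ onehot  # shape (n_query, n_cls)

# two well-separated clusters as the "new task"
X_ctx = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y_ctx = np.array([0, 0, 1, 1])
probs = pfn_forward(X_ctx, y_ctx, np.array([[0.05, 0.0], [5.05, 5.0]]))
print(probs.argmax(axis=1))  # -> [0 1]
```

The key point the sketch captures is that no parameters are fit to the new task: all task-specific information enters through the context in the forward pass.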
no code implementations • 17 Nov 2023 • Benjamin Feuer, Chinmay Hegde, Niv Cohen
Tabular classification has traditionally relied on supervised algorithms, which estimate the parameters of a prediction model from labeled training data.
no code implementations • 7 Nov 2023 • Benjamin Feuer, Chinmay Hegde
Modern computer vision foundation models are trained on massive amounts of data, incurring large economic and environmental costs.
1 code implementation • 27 Oct 2023 • Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire
We introduce ArcheType, a simple, practical method for context sampling, prompt serialization, model querying, and label remapping, which enables large language models to solve CTA problems in a fully zero-shot manner.
Ranked #1 on Column Type Annotation on WDC SOTAB (Weighted F1 metric)
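The four pipeline stages named in the abstract (context sampling, prompt serialization, model querying, label remapping) can be sketched as follows. This is a minimal illustration, not the ArcheType implementation: the label set is invented, and `query_llm` is a placeholder for any chat/completion API call.

```python
import random

LABELS = ["country", "date", "price", "name"]  # illustrative CTA label set

def sample_context(values, k=5, seed=0):
    """Context sampling: a few distinct cell values represent the column."""
    distinct = sorted(set(values))
    random.seed(seed)
    return random.sample(distinct, min(k, len(distinct)))

def serialize_prompt(col_values, labels=LABELS):
    """Prompt serialization: pack sampled values and the label set into text."""
    return (f"Column values: {', '.join(col_values)}\n"
            f"Choose exactly one type from: {', '.join(labels)}\nType:")

def remap_label(answer, labels=LABELS):
    """Label remapping: snap a free-form model answer onto the label set."""
    answer = answer.strip().lower()
    for lab in labels:
        if lab in answer:
            return lab
    return None  # abstain if nothing matches

def annotate_column(values, query_llm):
    """Zero-shot CTA: sample, serialize, query, remap -- no training step."""
    prompt = serialize_prompt(sample_context(values))
    return remap_label(query_llm(prompt))

# stand-in for a real LLM endpoint (hypothetical response)
fake_llm = lambda prompt: "The type is: Country."
print(annotate_column(["France", "Japan", "Peru"], fake_llm))  # -> country
```

Remapping is what makes the pipeline fully zero-shot in practice: the model's free-form answer is constrained onto the target label vocabulary rather than trusted verbatim.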
1 code implementation • 7 Aug 2023 • Benjamin Feuer, Ameya Joshi, Minh Pham, Chinmay Hegde
To our knowledge, this is the first result showing (near) state-of-the-art distributional robustness on limited data budgets.
1 code implementation • NeurIPS 2023 • Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Benjamin Feuer, Chinmay Hegde, Ganesh Ramakrishnan, Micah Goldblum, Colin White
To this end, we conduct the largest tabular data analysis to date, comparing 19 algorithms across 176 datasets, and we find that the 'NN vs. GBDT' debate is overemphasized: for a surprisingly high number of datasets, either the performance difference between GBDTs and NNs is negligible, or light hyperparameter tuning on a GBDT is more important than choosing between NNs and GBDTs.
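The "light hyperparameter tuning" the finding refers to can be as simple as a small random search over a handful of GBDT settings. The sketch below shows the search loop only; the search space is illustrative, and `evaluate` is a placeholder for whatever cross-validated score a real GBDT library would return.

```python
import random

# Illustrative GBDT-style search space (not from the paper)
SPACE = {
    "learning_rate": [0.03, 0.1, 0.3],
    "max_depth": [3, 6, None],
    "n_estimators": [100, 300],
}

def light_random_search(evaluate, n_trials=8, seed=0):
    """A handful of random trials -- 'light' tuning in the paper's sense."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SPACE.items()}
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# toy stand-in for a cross-validated accuracy function (hypothetical)
best_cfg, best_score = light_random_search(lambda cfg: cfg["n_estimators"])
print(best_cfg)
```

Even this small budget, eight trials here, is often enough to move a GBDT past an untuned neural baseline, which is the sense in which tuning can matter more than model family.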
1 code implementation • 12 Feb 2023 • Andre Nakkab, Benjamin Feuer, Chinmay Hegde
Recent advances in training vision-language models have demonstrated unprecedented robustness and transfer learning effectiveness; however, standard computer vision datasets are image-only, and therefore not well adapted to such training methods.
1 code implementation • 13 Oct 2022 • Benjamin Feuer, Ameya Joshi, Chinmay Hegde
Vision language (VL) models like CLIP are robust to natural distribution shifts, in part because CLIP learns on unstructured data using a technique called caption supervision: the model interprets image-linked texts as ground-truth labels.
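The sense in which captions act as ground-truth labels can be sketched with a CLIP-style symmetric contrastive loss (a standard InfoNCE formulation, shown here as an illustration rather than the paper's code): within a batch, each image's own caption is its positive target among all captions, and vice versa.

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE: the matching caption is the 'label' for each
    image (diagonal of the similarity matrix), and vice versa."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (B, B) cosine similarities

    def xent_diag(l):
        # cross-entropy with targets on the diagonal
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
print(clip_contrastive_loss(img, img.copy()))          # matched pairs: low loss
print(clip_contrastive_loss(img, np.roll(img, 1, 0)))  # shuffled captions: high loss
```

No class taxonomy appears anywhere: the supervision signal is entirely the image-text pairing, which is why unstructured web data suffices.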
no code implementations • 15 Jun 2022 • Benjamin Feuer, Ameya Joshi, Chinmay Hegde
State-of-the-art image classifiers trained on massive datasets (such as ImageNet) have been shown to be vulnerable to a range of both intentional and incidental distribution shifts.