Search Results for author: Sean Farhat

Found 1 paper, 1 paper with code

On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models

2 code implementations • 4 Apr 2024 • Sean Farhat, Deming Chen

We observe that, when distilled on a task from a pre-trained teacher model, a small model can match or surpass the performance it would achieve if it were pre-trained and then fine-tuned on that task.

Contrastive Learning • Knowledge Distillation
