2 code implementations • 9 Apr 2024 • David Kurzendörfer, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
However, existing benchmarks predate the popularization of large multi-modal models, such as CLIP and CLAP.
Audio Classification • Generalized Zero-Shot Learning