no code implementations • 24 Nov 2023 • Seonghak Kim, Gyeongdo Ham, SuIn Lee, Donggon Jang, Daeshik Kim
To distill optimal knowledge by adjusting non-target class predictions, we apply a higher temperature to low energy samples to create smoother distributions and a lower temperature to high energy samples to achieve sharper distributions.
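The energy-based temperature assignment described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear energy-to-temperature mapping, the batch-wise normalization, and the temperature bounds `t_low`/`t_high` are all assumptions made for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def energy(logits):
    # Free-energy score per sample: E(x) = -logsumexp(logits).
    m = logits.max(axis=-1)
    return -(m + np.log(np.exp(logits - m[:, None]).sum(axis=-1)))

def energy_adaptive_softmax(logits, t_low=2.0, t_high=4.0):
    """Assign a per-sample temperature: low-energy samples get the
    higher temperature (smoother target distribution), high-energy
    samples get the lower temperature (sharper distribution)."""
    e = energy(logits)
    # Normalize energies to [0, 1] within the batch (illustrative choice).
    s = (e - e.min()) / (e.max() - e.min() + 1e-12)
    # s = 0 (lowest energy) -> t_high; s = 1 (highest energy) -> t_low.
    temps = t_high - s * (t_high - t_low)
    return softmax(logits / temps[:, None])
```

A confident (low-energy) sample is thus softened more than an uncertain (high-energy) one, yielding richer non-target class information for the former and a sharper signal for the latter.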
no code implementations • 24 Nov 2023 • Gyeongdo Ham, Seonghak Kim, SuIn Lee, Jae-Hyeok Lee, Daeshik Kim
Furthermore, we propose a method called cosine similarity weighted temperature (CSWT) to further improve performance.