no code implementations • 19 Sep 2023 • Namjun Kim, Chanho Min, Sejun Park
We next prove a lower bound on $w_{\min}$ for uniform approximation using general activation functions including ReLU: $w_{\min}\ge d_y+1$ if $d_x<d_y\le2d_x$.
no code implementations • 18 Apr 2023 • Jihyeon Seo, Kyusam Oh, Chanho Min, Yongkeun Yun, Sungwoo Cho
We propose deep collective knowledge distillation (DCKD), a model compression method that trains student models with rich information so that they acquire knowledge not only from their teacher model but also from other student models.
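To make the idea of learning from both a teacher and peer students concrete, here is a minimal sketch. It is not the DCKD objective from the paper, which the snippet above does not specify. It assumes a standard temperature-scaled KL distillation term toward the teacher plus an averaged KL term toward the other students. The function name `collective_distillation_loss` and the weights `alpha`, `beta`, and `temperature` are hypothetical choices for illustration only.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the last axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # Mean over the batch of KL(p || q) between row-wise distributions.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def collective_distillation_loss(student_logits, teacher_logits, peer_logits,
                                 labels, temperature=4.0, alpha=0.5, beta=0.25):
    # Hypothetical combined loss (an assumption, not the paper's formula):
    # cross-entropy on labels + KL toward the teacher + mean KL toward peers.
    probs = softmax(student_logits)
    rows = np.arange(len(labels))
    ce = float(-np.mean(np.log(probs[rows, labels] + 1e-12)))

    s_soft = softmax(student_logits, temperature)
    kd_teacher = kl_divergence(softmax(teacher_logits, temperature), s_soft)

    if len(peer_logits) > 0:
        kd_peers = float(np.mean([
            kl_divergence(softmax(p, temperature), s_soft) for p in peer_logits
        ]))
    else:
        kd_peers = 0.0

    # The T^2 factor is the usual rescaling used with softened targets.
    t2 = temperature ** 2
    return ce + alpha * t2 * kd_teacher + beta * t2 * kd_peers
```

In this sketch, each student would be trained with its own copy of the loss, treating the remaining students' logits as `peer_logits`. Whether peers are updated jointly or kept fixed is a design choice the snippet does not settle.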