1 code implementation • 27 Apr 2023 • Joo Hyung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, Johan Obando-Ceron, Han-Byul Kim, Namhoon Lee, Elias Frantar, Yun Long, Amir Yazdanbakhsh, Shivani Agrawal, Suvinay Subramanian, Xin Wang, Sheng-Chun Kao, Xingyao Zhang, Trevor Gale, Aart Bik, Woohyun Han, Milen Ferev, Zhonglin Han, Hong-Seok Kim, Yann Dauphin, Gintare Karolina Dziugaite, Pablo Samuel Castro, Utku Evci
This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research.
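The core operation such a pruning library builds on can be sketched as simple magnitude pruning: zero out the smallest-magnitude weights until a target sparsity is reached. The sketch below uses plain NumPy and is a generic illustration of the technique, not JaxPruner's actual API.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude entries of w until roughly
    `sparsity` fraction of entries are zero. Generic illustration of
    magnitude pruning; NOT JaxPruner's API."""
    k = int(np.floor(sparsity * w.size))  # number of weights to prune
    if k == 0:
        return w.copy()
    flat = np.abs(w).ravel()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(w) > threshold
    return w * mask
```

In a sparse-training loop this mask would typically be recomputed on a schedule so pruned weights can regrow or stay fixed, depending on the algorithm.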
1 code implementation • 30 Jan 2022 • Wonpyo Park, WoongGi Chang, Donggeon Lee, Juntae Kim, Seung-won Hwang
The former loses the precision of relative positions through linearization, while the latter loses tight integration of node-edge and node-topology interactions.
Ranked #13 on Graph Regression on PCQM4Mv2-LSC
1 code implementation • 8 Feb 2021 • Yonghyun Kim, Wonpyo Park
These allow the parameters of the embedding network to settle on a local optimum with better generalization.
no code implementations • 9 Sep 2020 • Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho
Mutual learning is an ensemble training strategy to improve generalization by transferring individual knowledge to each other while simultaneously training multiple models.
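The transfer term in mutual learning is commonly a KL divergence that pushes each model's predictive distribution toward its peer's. A minimal NumPy sketch of that mimicry term (an assumption about the exact formulation, shown for illustration):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def mutual_kl(logits_a, logits_b):
    """KL(p_b || p_a), averaged over the batch: the term model A adds
    to its supervised loss so it mimics peer model B. Model B adds the
    symmetric term KL(p_a || p_b) to its own loss."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    return np.sum(p_b * (np.log(p_b) - np.log(p_a)), axis=-1).mean()
```

Both models are trained from scratch simultaneously, so unlike distillation there is no fixed teacher: each network is the other's moving target.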
1 code implementation • ECCV 2020 • Yonghyun Kim, Wonpyo Park, Jongju Shin
Moreover, we propose a novel compensation method to increase the number of referenced instances in the training stage.
5 code implementations • CVPR 2020 • Yonghyun Kim, Wonpyo Park, Myung-Cheol Roh, Jongju Shin
In the field of face recognition, a model learns to distinguish millions of face images using low-dimensional embedding features, and such vast information may not be properly encoded in a conventional model with a single branch.
no code implementations • 3 Oct 2019 • Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho
We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches during training.
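One way to read the described scheme: a layer's weights are replicated across several branches, and during training a random combination of branch outputs is merged, while at test time the branches are deterministically averaged. The sketch below makes those assumptions explicit (per-output-unit branch sampling and test-time averaging are illustrative choices, not the paper's exact rule):

```python
import numpy as np

def branched_linear(x, branch_weights, rng, train=True):
    """Linear layer decomposed into multiple branches with different
    parameters. Training: sample one branch per output unit (assumed
    merging rule). Evaluation: average the branches (assumed)."""
    W = np.stack(branch_weights)              # (B, d_in, d_out)
    B, _, d_out = W.shape
    if train:
        choice = rng.integers(B, size=d_out)  # random branch per unit
        W_mix = W[choice, :, np.arange(d_out)].T  # (d_in, d_out)
    else:
        W_mix = W.mean(axis=0)
    return x @ W_mix
```

The random merging acts as a regularizer in the spirit of dropout: each training step sees a different effective weight matrix drawn from the branch set.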
no code implementations • 28 May 2019 • Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi
In this paper we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way d-dimensional codes associated with input data.
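The code structure can be pictured as d independent k-way categorical distributions produced by the encoder. A small sketch of reading a discrete code out of such logits (the sampling step only; the paper's training objective is not reproduced here):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sample_code(logits, rng):
    """Sample a k-way, d-dimensional discrete code from encoder logits
    of shape (d, k): one categorical draw per code dimension. A sketch
    of the DIMCO code structure, not the learned encoder itself."""
    probs = softmax(logits)  # (d, k) row-wise categorical distributions
    return np.array([rng.choice(len(p), p=p) for p in probs])
```

Such a code stores d * log2(k) bits per input, which is what makes the representation compact compared with dense float embeddings.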
3 code implementations • CVPR 2019 • Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho
Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller.
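The conventional form of this transfer matches the student's softened output distribution to the teacher's. A NumPy sketch of that classic soft-label loss (Hinton-style, shown as background; the paper itself proposes a relational variant that transfers structure among examples rather than individual outputs):

```python
import numpy as np

def softmax(z, T=1.0):
    zt = z / T
    e = np.exp(zt - zt.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between teacher and student distributions softened
    by temperature T; the T**2 factor keeps gradient scale comparable
    across temperatures."""
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T))
    return -(T * T) * np.mean(np.sum(p_t * log_p_s, axis=-1))
```

In practice this term is combined with the ordinary supervised loss on ground-truth labels, weighted by a mixing coefficient.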