no code implementations • 22 Dec 2020 • Teguh Budianto, Tomohiro Nakai, Kazunori Imoto, Takahiro Takimoto, Kosuke Haruki
Through the learning mechanism, the proposed method aims to reduce the problem of bad cycle consistency, in which a bidirectional GAN might not be able to reproduce samples with a large difference between normal and abnormal samples.
no code implementations • 26 Jun 2019 • Kosuke Haruki, Taiji Suzuki, Yohei Hamakawa, Takeshi Toda, Ryuji Sakai, Masahiro Ozawa, Mitsuhiro Kimura
Large-batch stochastic gradient descent (SGD) is widely used for training in distributed deep learning because of its training-time efficiency, however, extremely large-batch SGD leads to poor generalization and easily converges to sharp minima, which prevents naive large-scale data-parallel SGD (DP-SGD) from converging to good minima.