no code implementations • 29 Jun 2023 • Rinor Cakaj, Jens Mehnert, Bin Yang
However, we show experimentally that, despite the approximate additive penalty of batch normalization (BN), feature maps in deep neural networks (DNNs) tend to explode at the beginning of the network and contain large values throughout training.
no code implementations • 29 Jun 2023 • Rinor Cakaj, Jens Mehnert, Bin Yang
Large weights in deep neural networks are a sign of a more complex network that is overfitted to the training data.