no code implementations • 6 Jul 2021 • Dumindu Tissera, Rukshan Wijessinghe, Kasun Vithanage, Alex Xavier, Subha Fernando, Ranga Rodrigo
Having multiple parallel convolutional/dense operations in each layer solves this problem, but without any context-dependent allocation of resources among these operations: the parallel computations tend to learn similar features making the widening process less effective.