An Efficient Training Framework for Reversible Neural Architectures

As machine learning models and datasets rapidly grow in scale, their large memory footprint impedes efficient training. Reversible operators can reduce memory consumption by discarding intermediate feature maps during the forward pass and recovering them via their inverse functions during backward propagation, trading extra computation for memory savings. However, current implementations of reversible layers focus mainly on reducing memory usage and neglect this computation overhead. In this work, we formulate a decision problem for reversible operators with training time as the objective function and memory usage as the constraint; a sketch of both ingredients follows below. By solving this problem, we maximize the training throughput of reversible neural architectures. Our proposed framework fully automates this decision process, enabling researchers to develop and train reversible neural networks more efficiently.
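The following is a minimal sketch, not the paper's implementation. It illustrates the two ideas in the abstract under stated assumptions: (1) an additive-coupling reversible block whose inputs can be exactly reconstructed from its outputs, so intermediate feature maps need not be stored; and (2) a brute-force version of the per-layer decision problem, choosing which layers to run reversibly so that estimated step time is minimized under a memory budget. The names `LayerProfile` and `best_reversibility_plan`, as well as the cost and memory numbers, are hypothetical and stand in for whatever profiling data the framework would collect.

```python
# Hypothetical sketch of reversible blocks plus the reversibility decision
# problem; not the authors' code.
from dataclasses import dataclass
from itertools import product

import torch
import torch.nn as nn


class ReversibleBlock(nn.Module):
    """Additive-coupling block: y1 = x1 + F(x2), y2 = x2 + G(y1)."""

    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Reconstructs the discarded inputs from the outputs, so the
        # intermediate feature maps need not be kept for backward.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2


@dataclass
class LayerProfile:
    """Per-layer measurements (hypothetical units: ms and MB)."""
    fwd_ms: float        # forward time
    bwd_ms: float        # backward time with stored activations
    recompute_ms: float  # extra time to invert/recompute if run reversibly
    act_mb: float        # activation memory saved if run reversibly


def best_reversibility_plan(layers, memory_budget_mb, baseline_peak_mb):
    """Brute-force the per-layer reversible/non-reversible decision.

    Objective: minimal estimated step time. Constraint: peak activation
    memory must fit in `memory_budget_mb`. Only practical for a handful
    of layers; the paper's framework automates this decision at scale.
    """
    best_time, best_plan = float("inf"), None
    for plan in product([False, True], repeat=len(layers)):
        peak = baseline_peak_mb - sum(
            l.act_mb for l, rev in zip(layers, plan) if rev
        )
        if peak > memory_budget_mb:
            continue
        time = sum(
            l.fwd_ms + l.bwd_ms + (l.recompute_ms if rev else 0.0)
            for l, rev in zip(layers, plan)
        )
        if time < best_time:
            best_time, best_plan = time, plan
    return best_plan, best_time


if __name__ == "__main__":
    # Check that the reversible block reconstructs its inputs.
    f = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
    g = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
    block = ReversibleBlock(f, g)
    x1, x2 = torch.randn(4, 8), torch.randn(4, 8)
    with torch.no_grad():
        y1, y2 = block(x1, x2)
        r1, r2 = block.inverse(y1, y2)
    print(torch.allclose(r1, x1, atol=1e-5), torch.allclose(r2, x2, atol=1e-5))

    # Toy decision problem: three layers under a 900 MB budget.
    layers = [
        LayerProfile(fwd_ms=5, bwd_ms=10, recompute_ms=5, act_mb=300),
        LayerProfile(fwd_ms=8, bwd_ms=16, recompute_ms=8, act_mb=200),
        LayerProfile(fwd_ms=3, bwd_ms=6, recompute_ms=3, act_mb=400),
    ]
    plan, t = best_reversibility_plan(layers, memory_budget_mb=900,
                                      baseline_peak_mb=1200)
    print("reversible per layer:", plan, "estimated step time (ms):", t)
```

In this toy instance the cheapest feasible choice is to run only the layer with the largest activation footprint reversibly, which is exactly the kind of trade-off the decision problem captures: memory savings are bought with recomputation time, and the selection should be driven by measured costs rather than applied uniformly.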
