15 Oct 2020 • Vladislav Lialin, Rahul Goel, Andrey Simanovsky, Anna Rumshisky, Rushin Shah
To reduce training time, one can fine-tune the previously trained model on each data patch, but naive fine-tuning suffers from catastrophic forgetting: the model's performance degrades on data not represented in the patch.
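The sketch below is not the paper's code; it is a minimal toy illustration of the failure mode described above. A model is trained on an original dataset, then naively fine-tuned on a small "data patch" that covers only part of the data (here, only one of two classes), and its accuracy on the data not represented in the patch degrades. The task, model size, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def train(model, x, y, epochs, lr=0.1):
    # Plain full-batch SGD with cross-entropy loss.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Original data: two classes separated along the first feature.
x_orig = torch.randn(2000, 2)
y_orig = (x_orig[:, 0] > 0).long()

# Data patch: a small batch of new examples, all from class 1,
# so class 0 is not represented in the patch.
mask = y_orig == 1
x_patch, y_patch = x_orig[mask][:100], y_orig[mask][:100]

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
train(model, x_orig, y_orig, epochs=300)
print("original-data accuracy before patch fine-tuning:",
      accuracy(model, x_orig, y_orig))

# Naive fine-tuning on the patch alone: the model drifts toward always
# predicting the patch's class, so accuracy on class-0 examples collapses.
train(model, x_patch, y_patch, epochs=300)
print("original-data accuracy after patch fine-tuning: ",
      accuracy(model, x_orig, y_orig))
```

Mitigations such as rehearsal (mixing in samples of the original data during fine-tuning) or regularizing toward the previous parameters would reduce this degradation; the example only demonstrates the naive case.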
20 Oct 2016 • Alexander Ulanov, Andrey Simanovsky, Manish Marwah
Machine learning is implemented in a distributed fashion in order to address the scalability issues that arise from computationally intensive training on large amounts of data.
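As a rough illustration of the kind of distributed implementation this refers to, the sketch below simulates synchronous data-parallel SGD in a single process: the training data is partitioned across workers, each worker computes a gradient on its shard, and the gradients are averaged before every update. A real system would perform the averaging over the network (e.g., with an all-reduce or a parameter server); the regression task, worker count, and learning rate here are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: y = X @ w_true + noise.
n, d, n_workers = 10_000, 20, 4
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

# Partition the data across workers (the "distributed" aspect being modeled).
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

def local_gradient(w, X_shard, y_shard):
    # Mean-squared-error gradient computed from one worker's shard only.
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

w = np.zeros(d)
lr = 0.05
for step in range(200):
    # Each worker computes its local gradient; averaging stands in for
    # the all-reduce step of a real distributed training system.
    grads = [local_gradient(w, Xs, ys) for Xs, ys in shards]
    w -= lr * np.mean(grads, axis=0)

print("parameter error:", np.linalg.norm(w - w_true))
```

In this synchronous scheme each step costs one round of communication, so in practice the communication volume and the number of workers govern how well training scales.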