Transfer Learning
2819 papers with code • 7 benchmarks • 14 datasets
Transfer Learning is a machine learning technique in which a model trained on one task is re-purposed and fine-tuned for a related but different task. The idea is to leverage the knowledge captured by a pre-trained model to solve a new but related problem. This is useful when there is too little data to train a new model from scratch, or when the new task is similar enough to the original one that the pre-trained model can be adapted with only minor modifications.
(Image credit: Subodh Malgonde)
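As a concrete illustration of this workflow, here is a minimal PyTorch sketch of fine-tuning a pre-trained image classifier on a new task; the number of classes, the dummy batch, and the hyperparameters are placeholders, not prescriptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a model pre-trained on ImageNet (the "source" task).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained backbone so its learned features are reused as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for the new task (10 classes is a placeholder).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are updated during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

Unfreezing some or all backbone layers (usually with a smaller learning rate) is the other common variant when the new task has enough data.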
Libraries
Use these libraries to find Transfer Learning models and implementations
Datasets
Subtasks
Most implemented papers
Unsupervised Domain Adaptation by Backpropagation
Here, we propose a new approach to domain adaptation in deep architectures that can be trained on large amounts of labeled data from the source domain and large amounts of unlabeled data from the target domain (no labeled target-domain data is necessary).
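The paper's key mechanism is a gradient reversal layer, which makes features indistinguishable across domains by flipping the gradient from an auxiliary domain classifier. A minimal PyTorch sketch follows; `feature_extractor`, `label_classifier`, and `domain_classifier` in the comments are hypothetical modules, not names from the paper's code.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing into the feature extractor, scaled by lambda.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Sketch of the forward pass: shared features feed both heads; the domain head
# sees reversed gradients, pushing the features toward domain invariance.
# features = feature_extractor(x)
# class_logits = label_classifier(features)                  # labeled source data only
# domain_logits = domain_classifier(grad_reverse(features))  # source + target data
```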
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
We introduce a new approach to generative data-driven dialogue systems (e.g., chatbots) called TransferTransfo, which combines a transfer-learning-based training scheme with a high-capacity Transformer model.
Unsupervised Data Augmentation for Consistency Training
In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
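A sketch of the consistency objective in the spirit of this work, assuming a generic `augment` callable standing in for an advanced augmentation method (e.g., RandAugment); the paper's full loss adds details such as confidence masking and prediction sharpening, so treat this as illustrative only.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, augment):
    """Consistency term: the prediction on a clean unlabeled example
    serves as a fixed target for its augmented counterpart."""
    with torch.no_grad():
        # Target distribution from the clean example; no gradient flows here.
        p_clean = F.softmax(model(x_unlabeled), dim=-1)
    # Prediction on the strongly augmented version of the same example.
    logp_aug = F.log_softmax(model(augment(x_unlabeled)), dim=-1)
    # KL(p_clean || p_aug), averaged over the batch.
    return F.kl_div(logp_aug, p_clean, reduction="batchmean")
```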
Going deeper with Image Transformers
In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.
Parameter-Efficient Transfer Learning for NLP
On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task.
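The parameter-efficient mechanism here is the adapter: a small bottleneck module inserted after each transformer sub-layer and trained while the pre-trained weights stay frozen. A minimal sketch, with an illustrative bottleneck size:

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter in the spirit of this paper: a small residual module
    inserted after a transformer sub-layer. Only these weights (plus layer
    norms and the task head) are trained; the backbone stays frozen."""

    def __init__(self, hidden_dim, bottleneck_dim=64):  # bottleneck_dim is illustrative
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x):
        # The residual connection keeps the adapter close to the identity
        # at initialization, so pre-trained behavior is preserved early on.
        return x + self.up(self.act(self.down(x)))
```

Because only the adapters are task-specific, each new task adds a few percent of the backbone's parameters rather than a full copy of the model.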
FNet: Mixing Tokens with Fourier Transforms
At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
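The core of FNet is replacing self-attention with an unparameterized Fourier transform applied along the sequence and hidden dimensions, keeping only the real part. A minimal PyTorch sketch of that mixing step (shapes are illustrative):

```python
import torch

def fourier_mixing(x):
    """Token mixing as in FNet: a 2D FFT over the hidden and sequence
    dimensions, keeping only the real part. No learned parameters."""
    # x: (batch, seq_len, hidden)
    return torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real

# Example: mix a batch of 2 sequences of length 16 with hidden size 8.
x = torch.randn(2, 16, 8)
mixed = fourier_mixing(x)  # same shape as x
```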
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Rethinking Channel Dimensions for Efficient Model Design
We then investigate a model's channel configuration by searching over network architectures' channel configurations under a computational cost constraint.
DeiT III: Revenge of the ViT
Our evaluations on image classification (ImageNet-1k, with and without pre-training on ImageNet-21k), transfer learning, and semantic segmentation show that our procedure outperforms previous fully supervised training recipes for ViT by a large margin.