Transformers are increasingly replacing older generations of deep neural networks due to their success in a wide range of applications. The dominant approach to using transformers is to pre-train them on a large dataset and then fine-tune them on a downstream task. However, as transformers grow larger, full fine-tuning is becoming infeasible for transfer learning. In this short survey, we list several recent methods that make transfer learning with transformers more efficient.