Submitted: 09 February 2025
Posted: 11 February 2025
Abstract
Low-Rank Adaptation (LoRA) is a parameter-efficient approach to fine-tuning large pre-trained language models: it freezes the pre-trained weights and represents the weight updates as products of trainable low-rank matrices, reducing the memory and computational overhead of adaptation. This survey provides a comprehensive overview of LoRA, covering its theoretical foundations, its applications, and the advantages it offers over full fine-tuning. We examine how LoRA enables efficient task adaptation in settings such as domain adaptation, few-shot learning, transfer learning, and zero-shot learning. We also discuss its challenges, including rank selection, generalization to complex tasks, and the risk of overfitting, and identify directions for future research such as adaptive rank selection, integration with other fine-tuning techniques, and multi-modal and cross-domain adaptation. By making large-scale models cheaper to adapt, LoRA is significant for advancing natural language processing (NLP) and machine learning applications, particularly when computational resources are limited. The survey aims to capture the current state of LoRA, its practical applications, and the open research opportunities for further enhancing its capabilities.
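To make the mechanism summarized above concrete, the following is a minimal sketch of a LoRA-augmented linear layer, assuming a PyTorch environment; the class name `LoRALinear` and the hyperparameters `rank` and `alpha` are illustrative choices, not the interface of any specific library or of the methods surveyed here.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha / r) * (B A) x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pre-trained weights
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # low-rank factor A (r x d_in)
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # B starts at zero, so the update is initially zero
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output of the frozen layer plus the scaled low-rank correction (B A) x.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Example: only the 2 * 768 * 8 = 12,288 LoRA parameters are trainable,
# versus 768 * 768 + 768 = 590,592 parameters in the frozen base layer.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```

Only A and B are updated during fine-tuning; after training, the product B A can be merged into the frozen weight matrix, so the adapted layer adds no extra inference cost.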
Keywords:
1. Introduction
2. Background
2.1. Large Language Models
2.2. Challenges in Fine-Tuning Large Language Models
2.3. Low-Rank Approximation in Neural Networks
2.4. Low-Rank Adaptation (LoRA)
2.5. Advantages of LoRA
2.6. Related Techniques
3. Related Work
3.1. Adapter Networks
3.2. Prompt-Tuning
3.3. Low-Rank Approximation in Neural Networks
3.4. Parameter-Efficient Transfer Learning
3.5. Comparative Analysis of LoRA and Related Approaches
3.6. Summary
4. Applications of LoRA
4.1. Domain Adaptation
4.2. Few-Shot Learning
4.3. Transfer Learning
4.4. Zero-Shot Learning
4.5. Real-World Case Studies
4.6. Summary
5. Challenges and Limitations of LoRA
5.1. Generalization to Complex Tasks
5.2. Rank Selection and Trade-offs
5.3. Task-Specific Fine-Tuning
5.4. Implementation and Scalability Issues
5.5. Task Transfer and Adaptation Across Domains
5.6. Overfitting in Small Datasets
5.7. Summary of Challenges and Limitations
6. Future Directions and Research Opportunities
6.1. Adaptive Rank Selection
6.2. Integration with Other Fine-Tuning Techniques
6.3. Domain-Specific and Cross-Domain Adaptation
6.4. Multi-Modal Learning and LoRA
6.5. Model Robustness and Generalization
6.6. LoRA in Federated Learning and Privacy-Preserving AI
6.7. Benchmarking and Standardization
6.8. Summary
7. Conclusion
References
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).