Submitted:
29 July 2023
Posted:
01 August 2023
Abstract
Keywords:
1. Introduction
- Hypothesis 1. Research Objective: The research aims to assess the audience’s response to Llama 2 and to verify Meta’s expectation that an open-source model will develop faster than closed-source models [2].
- Hypothesis 2. Research Objective: The research aims to assess the challenges encountered by early adopters in deploying the Llama 2 model.
- Hypothesis 3. Research Objective: The research aims to assess the challenges encountered by early adopters in fine-tuning the Llama 2 model.
- Hypothesis 4. Research Objective: The research seeks to show that the medical domain consistently ranks among the primary domains for which early adopters fine-tune models.
2. Background and Context
2.1. Evolution of Language Models
2.2. Natural Language Processing (NLP)
2.3. Transformer Architecture
2.4. Supervised fine-tuning
3. Llama 2 Models and Licensing
3.1. Accessibility and Licensing
3.2. Llama 2 Models and Versions
4. Training Process
4.1. Pretraining Data
4.2. Llama 2 Fine-tuning
4.3. Llama 2 Eco-consciousness
- Llama 2 7B: 184,320 GPU hours, 400W power consumption, and 31.22 tCO2eq carbon emissions.
- Llama 2 13B: 368,640 GPU hours, 400W power consumption, and 62.44 tCO2eq carbon emissions.
- Llama 2 70B: 1,720,320 GPU hours, 400W power consumption, and 291.42 tCO2eq carbon emissions.
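The figures above can be cross-checked with simple arithmetic. The sketch below is a minimal illustration, not the method used to produce the reported numbers: it assumes every GPU draws the stated 400 W for the full duration and ignores datacenter overhead, then back-calculates the carbon intensity implied by each reported emission value. The function names `energy_mwh` and `implied_intensity` are hypothetical helpers introduced here.

```python
# (model, GPU hours, per-GPU power in W, reported tCO2eq), as listed above.
REPORTED = [
    ("Llama 2 7B", 184_320, 400, 31.22),
    ("Llama 2 13B", 368_640, 400, 62.44),
    ("Llama 2 70B", 1_720_320, 400, 291.42),
]


def energy_mwh(gpu_hours, power_w):
    """Energy in MWh, assuming each GPU draws `power_w` watts for every hour."""
    # hours * watts = watt-hours; divide by 1e6 to convert Wh to MWh.
    return gpu_hours * power_w / 1_000_000


def implied_intensity(gpu_hours, power_w, tco2eq):
    """Implied carbon intensity in tCO2eq per MWh (numerically equal to kg per kWh)."""
    return tco2eq / energy_mwh(gpu_hours, power_w)


if __name__ == "__main__":
    for name, hours, watts, tco2 in REPORTED:
        mwh = energy_mwh(hours, watts)
        intensity = implied_intensity(hours, watts, tco2)
        print(f"{name}: {mwh:.3f} MWh, implied {intensity:.3f} tCO2eq/MWh")
```

Under these assumptions, all three models imply roughly the same intensity (about 0.423 tCO2eq per MWh), which suggests the reported emissions scale linearly with GPU hours.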
5. Llama 2: Early Adopters’ Case Studies and Projects
5.1. Official Llama2 Recipes Repository
5.2. Llama2.c by @karpathy
5.3. Llama2-Chinese by @FlagAlpha
5.4. Llama2-chatbot by @a16z-infra
5.5. Llama2-webui by @liltom-eth
5.6. Llama-2-Open-Source-LLM-CPU-Inference by @kennethleungty
5.7. Docker-llama2-chat by @soulteary
5.8. Llama2 by @dataprofessor
5.9. Llama-2-jax by @ayaka14732
5.10. LLaMA2-Accessory by @Alpha-VLLM
5.11. Llama2-Medical-Chatbot by @AIAnytime
5.12. Llama2-haystack by @anakin87
6. Results
- Hypothesis 1
- Null Hypothesis (H0): There is no significant difference between the audience’s response to the open-source Llama 2 model and its response to closed models.
- Alternative Hypothesis (H1): There is a significant difference in the audience’s response to Llama 2, with the open-source model developing faster than closed models, as expected by Meta.
- Hypothesis 2
- Null Hypothesis (H0): There is no significant difference in the challenges encountered by early adopters in deploying the Llama 2 model.
- Alternative Hypothesis (H1): There is a significant difference in the challenges encountered by early adopters in deploying the Llama 2 model.
- Hypothesis 3
- Null Hypothesis (H0): There is no significant difference in the challenges encountered by early adopters in fine-tuning the Llama 2 model.
- Alternative Hypothesis (H1): Early adopters encounter significant challenges in fine-tuning the Llama 2 model.
- Hypothesis 4
- Null Hypothesis (H0): There is no significant difference in the interest shown by early adopters of Llama 2 between the medical domain and other domains.
- Alternative Hypothesis (H1): Early adopters of Llama 2 prioritize the medical domain significantly more than other domains, indicating a greater interest in utilizing LLMs for medical applications.
7. Responsible AI and Ethical Considerations
- Bias Mitigation and Fairness: Early adopters’ experiences with Llama 2 have highlighted the importance of addressing biases in AI outputs. As a pre-trained model trained on diverse data sources, Llama 2 may inadvertently inherit biases present in the training data. Researchers and developers must implement robust techniques to identify and mitigate biases to ensure fairness and equitable outcomes across diverse user populations [56,57,58].
- Transparency and Interpretability: The complexity of deep learning models like Llama 2 can make their decision-making processes hard to understand. To promote transparency and interpretability, early adopters have emphasized the need for methods that provide insights into the model’s internal workings. Future research should focus on developing techniques that make AI models more interpretable, enabling users to comprehend the rationale behind the model’s predictions [59].
- Privacy and Data Protection: Llama 2’s success heavily relies on the vast amount of data used during pretraining. Early adopters recognize the significance of safeguarding user data and respecting privacy concerns. Employing privacy-preserving methods, such as federated learning or differential privacy, can uphold the confidentiality of user data while ensuring the model’s effectiveness [60,61].
- Ethical Use-Cases and Societal Impact: As AI technologies like Llama 2 become increasingly integrated into various domains, early adopters have stressed the importance of identifying and promoting ethically sound use cases. Research should also analyze the societal impact of Llama 2’s deployment, considering potential consequences for individuals, communities, and societal values. Striking a balance between innovation and responsible AI practices is crucial to harness the full potential of LLMs while mitigating unintended negative effects [62,63].
- Continuous Monitoring and Auditing: To maintain ethical AI practices, early adopters advocate for continuous monitoring and auditing of Llama 2’s performance. Regular assessments can help identify potential biases or deviations in the model’s behavior, enabling timely adjustments to ensure compliance with ethical standards [64,65].
- End-User Empowerment and Informed Consent: As AI models like Llama 2 become integral to user experiences, early adopters have emphasized the significance of end-user empowerment and informed consent. Users should be well-informed about the AI’s involvement in their interactions and have the right to control and modify the extent of AI-driven recommendations or decisions [66].
8. Future Directions and Research
9. Discussion
- The investigation into application diversity and effectiveness of Llama 2 indicates a notable level of interest among early adopters across a broad spectrum of AI projects. These adopters have demonstrated successful deployment of the model on multiple platforms and technologies, particularly when fine-tuned for domain-specific tasks. This observation underscores the model’s versatility and effectiveness in addressing various AI tasks, making it a potential solution of interest for researchers and developers seeking a unified model suitable for multiple applications.
- Early Adopters’ Feedback and Challenges: Although the available feedback is limited, early adopters reported encountering minimal challenges in both deploying and fine-tuning Llama 2. This outcome reflects favorably on Meta’s decision to release the model together with the model recipes [40], which appears to have contributed to a smooth implementation experience for the adopters.
- Cross-Model Comparisons: In order to obtain a comprehensive assessment of Llama 2’s standing within the AI landscape, it is suggested that future studies conduct comparative analyses with other prominent pretrained models. By undertaking cross-model comparisons, researchers can glean valuable insights into Llama 2’s distinctive contributions, advantages, and areas in which it outperforms existing alternatives. Such analyses would aid in elucidating the specific strengths and capabilities of Llama 2, contributing to a more holistic understanding of its potential in the field of artificial intelligence.
- Extended Use Cases and Domains: While the present study provides insights into early adopters’ deployment of Llama 2 in specific applications, future research endeavors could extend the investigation to encompass its implementation across additional domains and diverse use cases. Exploring Llama 2’s potential in emerging fields, such as healthcare, finance, and environmental sciences, would not only exemplify its versatility but also widen its potential impact across various industries and research domains.
10. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; Bikel, D.; Blecher, L.; Ferrer, C. C.; Chen, M.; Cucurull, G.; Esiobu, D.; Fernandes, J.; Fu, J.; Fu, W.; Fuller, B.; Gao, C.; Goswami, V.; Goyal, N.; Hartshorn, A.; Hosseini, S.; Hou, R.; Inan, H.; Kardas, M.; Kerkez, V.; Khabsa, M.; Kloumann, I.; Korenev, A.; Koura, P. S.; Lachaux, M.-A.; Lavril, T.; Lee, J.; Liskovich, D.; Lu, Y.; Mao, Y.; Martinet, X.; Mihaylov, T.; Mishra, P.; Molybog, I.; Nie, Y.; Poulton, A.; Reizenstein, J.; Rungta, R.; Saladi, K.; Schelten, A.; Silva, R.; Smith, E. M.; Subramanian, R.; Tan, X. E.; Tang, B.; Taylor, R.; Williams, A.; Kuan, J. X.; Xu, P.; Yan, Z.; Zarov, I.; Zhang, Y.; Fan, A.; Kambadur, M.; Narang, S.; Rodriguez, A.; Stojnic, R.; Edunov, S.; Scialom, T. Llama 2: Open Foundation and fine-tuned chat models. arXiv 2023, arXiv:2307.09288.
- Meta and Microsoft introduce the next generation of Llama. Available online: https://ai.meta.com/blog/llama-2/ (accessed on 28 July 2023).
- Roumeliotis, K.I.; Tselikas, N.D. ChatGPT and Open-AI Models: A Preliminary Review. Futur. Internet 2023, 15, 192.
- Dillion, D.; Tandon, N.; Gu, Y.; Gray, K. Can AI language models replace human participants? Trends in Cognitive Sciences 2023, 27, 597–600.
- Rahali, A.; Akhloufi, M.A. End-to-End Transformer-Based Models in Textual-Based NLP. AI 2023, 4, 54–110.
- Piris, Y.; Gay, A.-C. Customer satisfaction and natural language processing. J. Bus. Res. 2021, 124, 264–271.
- Dash, G.; Sharma, C.; Sharma, S. Sustainable Marketing and the Role of Social Media: An Experimental Study Using Natural Language Processing (NLP). Sustainability 2023, 15, 5443.
- Arowosegbe, A.; Oyelade, T. Application of Natural Language Processing (NLP) in Detecting and Preventing Suicide Ideation: A Systematic Review. Int. J. Environ. Res. Public Health 2023, 20, 1514.
- Tyagi, N.; Bhushan, B. Demystifying the Role of Natural Language Processing (NLP) in Smart City Applications: Background, Motivation, Recent Advances, and Future Research Directions. Wirel. Pers. Commun. 2023, 130, 857–908.
- Tyagi, N.; Bhushan, B. Demystifying the Role of Natural Language Processing (NLP) in Smart City Applications: Background, Motivation, Recent Advances, and Future Research Directions. Wirel. Pers. Commun. 2023, 130, 857–908.
- Pruneski, J.A.; Pareek, A.; Nwachukwu, B.U.; Martin, R.K.; Kelly, B.T.; Karlsson, J.; Pearle, A.D.; Kiapour, A.M.; Williams, R.J. Natural language processing: using artificial intelligence to understand human language in orthopedics. Knee Surgery, Sports Traumatol. Arthrosc. 2022, 31, 1203–1211.
- Mukhamadiyev, A.; Mukhiddinov, M.; Khujayarov, I.; Ochilov, M.; Cho, J. Development of Language Models for Continuous Uzbek Speech Recognition System. Sensors 2023, 23, 1145.
- Ahmed, A.; Leroy, G.; Lu, H.Y.; Kauchak, D.; Stone, J.; Harber, P.; Rains, S.A.; Mishra, P.; Chitroda, B. Audio delivery of health information: An NLP study of information difficulty and bias in listeners. Procedia Comput. Sci. 2023, 219, 1509–1517.
- Wang, J.; Xu, G.; Yan, F.; Wang, J.; Wang, Z. Defect transformer: An efficient hybrid transformer architecture for surface defect detection. Measurement 2023, 211.
- Drosouli, I.; Voulodimos, A.; Mastorocostas, P.; Miaoulis, G.; Ghazanfarpour, D. TMD-BERT: A Transformer-Based Model for Transportation Mode Detection. Electronics 2023, 12, 581.
- Philippi, D.; Rothaus, K.; Castelli, M. A vision transformer architecture for the automated segmentation of retinal lesions in spectral domain optical coherence tomography images. Sci. Rep. 2023, 13, 1–14.
- Aleissaee, A.A.; Kumar, A.; Anwer, R.M.; Khan, S.; Cholakkal, H.; Xia, G.-S.; Khan, F.S. Transformers in Remote Sensing: A Survey. Remote. Sens. 2023, 15, 1860.
- Panopoulos, I.; Nikolaidis, S.; Venieris, S.I.; Venieris, I.S. Exploring the performance and efficiency of Transformer models for NLP on mobile devices. arXiv 2023, arXiv:2306.11426.
- Ohri, K.; Kumar, M. Supervised fine-tuned approach for automated detection of diabetic retinopathy. Multimedia Tools Appl. 2023, 1–22.
- Li, H.; Zhu, C.; Zhang, Y.; Sun, Y.; Shui, Z.; Kuang, W.; Zheng, S.; Yang, L. Task-specific fine-tuning via variational information bottleneck for weakly-supervised pathology whole slide image classification. arXiv 2023, arXiv:2303.08446.
- Lodagala, V.S.; Ghosh, S.; Umesh, S. Pada: Pruning assisted domain adaptation for self-supervised speech representations. 2022 IEEE Spoken Language Technology Workshop (SLT) 2023.
- Han, X.; Zhang, Z.; Ding, N.; Gu, Y.; Liu, X.; Huo, Y.; Qiu, J.; Yao, Y.; Zhang, A.; Zhang, L.; et al. Pre-trained models: Past, present and future. AI Open 2021, 2, 225–250.
- Prottasha, N.J.; Sami, A.A.; Kowsher, M.; Murad, S.A.; Bairagi, A.K.; Masud, M.; Baz, M. Transfer Learning for Sentiment Analysis Using BERT Based Supervised Fine-Tuning. Sensors 2022, 22, 4157.
- Xu, Z.; Huang, S.; Zhang, Y.; Tao, D. Webly-Supervised Fine-Grained Visual Categorization via Deep Domain Adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 40, 1100–1113.
- Tang, C. I.; Qendro, L.; Spathis, D.; Kawsar, F.; Mascolo, C.; Mathur, A. Practical self-supervised continual learning with continual fine-tuning. arXiv 2023, arXiv:2303.17235.
- Skelton, J. Llama 2. A model overview and demo tutorial with Paperspace Gradient. Available online: https://blog.paperspace.com/llama-2/ (accessed on 28 July 2023).
- Hugging Face llama-2-7b. Available online: https://huggingface.co/meta-llama/Llama-2-7b (accessed on 28 July 2023).
- Llama 2 - Resource Overview - META AI. Available online: https://ai.meta.com/resources/models-and-libraries/llama/ (accessed on 28 July 2023).
- Llama 2 - Responsible Use Guide. Available online: https://ai.meta.com/llama/responsible-use-guide/ (accessed on 28 July 2023).
- Llama 2 License Agreement. Available online: https://github.com/facebookresearch/llama/blob/main/LICENSE (accessed on 28 July 2023).
- Inference code for Llama Models - GitHub. Available online: https://github.com/facebookresearch/llama/tree/main (accessed on 28 July 2023).
- Hugging Face Llama 2 Models. Available online: https://huggingface.co/models?other=llama-2 (accessed on 28 July 2023).
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, L.; Polosukhin, I. Attention is all you need. arXiv 2023, arXiv:1706.03762.
- Sennrich, R.; Haddow, B.; Birch, A. Neural machine translation of rare words with subword units. arXiv 2016, arXiv:1508.07909.
- Shazeer, N. GLU variants improve Transformer. arXiv 2020, arXiv:2002.05202.
- Song, F.; Yu, B.; Li, M.; Yu, H.; Huang, F.; Li, Y.; Wang, H. Preference ranking optimization for human alignment. arXiv 2023, arXiv:2306.17492.
- Taecharungroj, V. “What Can ChatGPT Do?” Analyzing Early Reactions to the Innovative AI Chatbot on Twitter. Big Data Cogn. Comput. 2023, 7, 35.
- Sotnikov, V.; Chaikova, A. Language Models for Multimessenger Astronomy. Galaxies 2023, 11, 63.
- Maroto-Gómez, M.; Castro-González, Á.; Castillo, J.C.; Malfaz, M.; Salichs, M. An adaptive decision-making system supported on user preference predictions for human–robot interactive communication. User Model. User-Adapted Interact. 2022, 33, 359–403.
- Facebookresearch/llama-recipes: Examples and recipes for Llama 2 model. Available online: https://github.com/facebookresearch/llama-recipes (accessed on 28 July 2023).
- Karpathy/LLAMA2.C: Inference llama 2 in one file of pure C. Available online: https://github.com/karpathy/llama2.c (accessed on 28 July 2023).
- Flagalpha/LLAMA2-Chinese. Available online: https://github.com/FlagAlpha/Llama2-Chinese (accessed on 28 July 2023).
- A16Z-infra/LLAMA2-chatbot. Available online: https://github.com/a16z-infra/llama2-chatbot (accessed on 28 July 2023).
- Liltom-eth/LLAMA2-webui. Available online: https://github.com/liltom-eth/llama2-webui (accessed on 28 July 2023).
- Kennethleungty/llama-2-open-source-llm-cpu-inference. Available online: https://github.com/kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference (accessed on 28 July 2023).
- Soulteary/docker-LLAMA2-chat. Available online: https://github.com/soulteary/docker-llama2-chat (accessed on 28 July 2023).
- Dataprofessor/Llama2. Available online: https://github.com/dataprofessor/llama2 (accessed on 28 July 2023).
- AYAKA14732/llama-2-jax. Available online: https://github.com/ayaka14732/llama-2-jax (accessed on 28 July 2023).
- Alpha-VLLM/LLAMA2-accessory. Available online: https://github.com/Alpha-VLLM/LLaMA2-Accessory (accessed on 28 July 2023).
- AIANYTIME/LLAMA2-Medical-chatbot. Available online: https://github.com/AIAnytime/Llama2-Medical-Chatbot (accessed on 28 July 2023).
- Anakin87/LLAMA2-Haystack. Available online: https://github.com/anakin87/llama2-haystack (accessed on 28 July 2023).
- Anakin87/LLAMA2-Haystack. Available online: https://github.com/anakin87/llama2-haystack (accessed on 28 July 2023).
- Bender, E. M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 2021.
- Weidinger, L.; Mellor, J.; Rauh, M.; Griffin, C.; Uesato, J.; Huang, P.-S.; Cheng, M.; Glaese, M.; Balle, B.; Kasirzadeh, A.; Kenton, Z.; Brown, S.; Hawkins, W.; Stepleton, T.; Biles, C.; Birhane, A.; Haas, J.; Rimell, L.; Hendricks, L. A.; Isaac, W.; Legassick, S.; Irving, G.; Gabriel, I. Ethical and social risks of harm from language models. arXiv 2021, arXiv:2112.04359.
- Solaiman, I.; Talat, Z.; Agnew, W.; Ahmad, L.; Baker, D.; Blodgett, S. L.; Daumé III, H.; Dodge, J.; Evans, E.; Hooker, S.; Jernite, Y.; Luccioni, A. S.; Lusoli, A.; Mitchell, M.; Newman, J.; Png, M.-T.; Strait, A.; Vassilev, A. Evaluating the social impact of Generative AI systems in Systems and Society. arXiv 2023, arXiv:2306.05949.
- Li, Y.; Zhang, Y. Fairness of ChatGPT. arXiv 2023, arXiv:2305.18569.
- Abramski, K.; Citraro, S.; Lombardi, L.; Rossetti, G.; Stella, M. Cognitive Network Science Reveals Bias in GPT-3, GPT-3.5 Turbo, and GPT-4 Mirroring Math Anxiety in High-School Students. Big Data Cogn. Comput. 2023, 7, 124.
- Rozado, D. The Political Biases of ChatGPT. Soc. Sci. 2023, 12, 148.
- Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 2019, 8, 832.
- Mazurek, G.; Małagocka, K. Perception of privacy and data protection in the context of the development of artificial intelligence. J. Manag. Anal. 2019, 6, 344–364.
- Goldsteen, A.; Saadi, O.; Shmelkin, R.; Shachor, S.; Razinkov, N. AI Privacy Toolkit. SoftwareX 2023, 22, 101352.
- Hagerty, A.; Rubinov, I. Global AI Ethics: A review of the social impacts and ethical implications of Artificial Intelligence. arXiv 2019, arXiv:1907.07892.
- Khakurel, J.; Penzenstadler, B.; Porras, J.; Knutas, A.; Zhang, W. The Rise of Artificial Intelligence under the Lens of Sustainability. Technologies 2018, 6, 100.
- Minkkinen, M.; Laine, J.; Mäntymäki, M. Continuous auditing of Artificial Intelligence: A conceptualization and assessment of tools and Frameworks. Digital Society 2022, 1.
- Mökander, J.; Floridi, L. Ethics-Based Auditing to Develop Trustworthy AI. Minds Mach. 2021, 31, 323–327.
- Usmani, U.A.; Happonen, A.; Watada, J. Human-centered artificial intelligence: Designing for user empowerment and ethical considerations. 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) 2023.
- Zeng, C.; Li, S.; Li, Q.; Hu, J.; Hu, J. A Survey on Machine Reading Comprehension—Tasks, Evaluation Metrics and Benchmark Datasets. Appl. Sci. 2020, 10, 7640.
- Eleftheriadis, P.; Perikos, I.; Hatzilygeroudis, I. Evaluating Deep Learning Techniques for Natural Language Inference. Appl. Sci. 2023, 13, 2577.
| Llama 2: Early Adopters' Projects | Areas of Focus |
|---|---|
| Recipes Repository [40] | fine-tuning |
| Llama2.c [41] | model deployment / PyTorch |
| Llama2-Chinese [42] | fine-tuning / language / Chinese |
| Llama2-chatbot [43] | chatbot |
| Llama2-webui [44] | model deployment / model optimization / CPU / GPU |
| Llama-2-Open-Source-LLM-CPU-Inference [45] | model deployment / model optimization / CPU |
| Docker-llama2-chat [46] | chatbot / model deployment / docker / CPU |
| Llama2 [47] | model deployment / chatbot |
| Llama-2-jax [48] | model deployment / JAX / PyTorch / Google Cloud TPUs |
| LLaMA2-Accessory [49] | model deployment / fine-tuning |
| Llama2-Medical-Chatbot [50] | chatbot / medical / fine-tuning / model deployment / CPU |
| Llama2-haystack [51] | model deployment / Haystack |
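
To make the distribution of focus areas in the table easier to read, the tags can be tallied per project. The sketch below is illustrative only: the rows are transcribed from the table above, `tally_focus_areas` is a hypothetical helper introduced here, and a raw count of tags is not a significance test for any of the hypotheses.

```python
from collections import Counter

# Rows transcribed from the table above: project -> "Areas of Focus" cell.
PROJECTS = {
    "Recipes Repository": "fine-tuning",
    "Llama2.c": "model deployment / PyTorch",
    "Llama2-Chinese": "fine-tuning / language / Chinese",
    "Llama2-chatbot": "chatbot",
    "Llama2-webui": "model deployment / model optimization / CPU / GPU",
    "Llama-2-Open-Source-LLM-CPU-Inference": "model deployment / model optimization / CPU",
    "Docker-llama2-chat": "chatbot / model deployment / docker / CPU",
    "Llama2": "model deployment / chatbot",
    "Llama-2-jax": "model deployment / JAX / PyTorch / Google Cloud TPUs",
    "LLaMA2-Accessory": "model deployment / fine-tuning",
    "Llama2-Medical-Chatbot": "chatbot / medical / fine-tuning / model deployment / CPU",
    "Llama2-haystack": "model deployment / Haystack",
}


def tally_focus_areas(projects):
    """Count how many projects carry each focus-area tag."""
    counts = Counter()
    for cell in projects.values():
        # Tags within a cell are separated by "/" in the table.
        counts.update(tag.strip() for tag in cell.split("/"))
    return counts


if __name__ == "__main__":
    for tag, n in tally_focus_areas(PROJECTS).most_common():
        print(f"{tag}: {n} of {len(PROJECTS)} projects")
```

With the rows as transcribed, model deployment is tagged on 9 of the 12 projects, fine-tuning and chatbot on 4 each, and medical on 1.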
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).