Submitted:
05 July 2024
Posted:
09 July 2024
You are already at the latest version
Abstract
Keywords:
Introduction
Literature Review
Literature Review on Large Language Models
The Role of Objectives and Reward Functions in AI
Learning through Trial and Error
The Role of Games in Cognitive Development
Training Large Language Models (LLMs)
The Necessity of Objectives and Goals in AI Reasoning
Challenges in Mathematical Reasoning
Simulated Environments versus Real-World Interaction
Towards Artificial General Intelligence (AGI)
The Importance of Multimodal AI in Achieving AGI
The Importance of Competing Models and Second Guessing
Conclusion
References
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
- Li, Y. Deep reinforcement learning: An overview. arXiv preprint 2017, arXiv:1701.07274 2017. [Google Scholar]
- François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G.; Pineau, J.; others. An introduction to deep reinforcement learning. Foundations and Trends® in Machine Learning 2018, 11, 219–354. [Google Scholar] [CrossRef]
- Henderson, P.; Islam, R.; Bachman, P.; Pineau, J.; Precup, D.; Meger, D. Deep reinforcement learning that matters. Proceedings of the AAAI conference on artificial intelligence, 2018, Vol. 32.
- Dulac-Arnold, G.; Mankowitz, D.; Hester, T. Challenges of real-world reinforcement learning. arXiv preprint 2019, arXiv:1904.12901 2019. [Google Scholar]
- Zhao, W.; Queralta, J.P.; Westerlund, T. Sim-to-real transfer in deep reinforcement learning for robotics: a survey. 2020 IEEE symposium series on computational intelligence (SSCI). IEEE, 2020, pp. 737–744.
- Chaslot, G.; Bakkes, S.; Szita, I.; Spronck, P. Monte-carlo tree search: A new framework for game ai. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2008, Vol. 4, pp. 216–217.
- Gelly, S.; Silver, D. Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence 2011, 175, 1856–1875. [Google Scholar] [CrossRef]
- Gelly, S.; Kocsis, L.; Schoenauer, M.; Sebag, M.; Silver, D.; Szepesvári, C.; Teytaud, O. The grand challenge of computer Go: Monte Carlo tree search and extensions. Communications of the ACM 2012, 55, 106–113. [Google Scholar] [CrossRef]
- Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; others. Mastering the game of Go with deep neural networks and tree search. nature 2016, 529, 484–489. [Google Scholar] [CrossRef] [PubMed]
- Silver, D.; Schrittwieser, J.; Simonyan, K.; Antonoglou, I.; Huang, A.; Guez, A.; Hubert, T.; Baker, L.; Lai, M.; Bolton, A.; others. Mastering the game of go without human knowledge. nature 2017, 550, 354–359. [Google Scholar] [CrossRef] [PubMed]
- Anthony, T.; Tian, Z.; Barber, D. Thinking fast and slow with deep learning and tree search. Advances in neural information processing systems 2017, 30. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Advances in neural information processing systems 2017, 30.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint 2018, arXiv:1810.04805 2018. [Google Scholar]
- Hao, Y.; Dong, L.; Wei, F.; Xu, K. Visualizing and understanding the effectiveness of BERT. arXiv preprint 2019, arXiv:1908.05620 2019. [Google Scholar]
- Koroteev, M.V. BERT: a review of applications in natural language processing and understanding. arXiv preprint, 2021; arXiv:2103.11943 2021. [Google Scholar]
- Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI open 2022, 3, 111–132. [Google Scholar] [CrossRef]
- Bengesi, S.; El-Sayed, H.; Sarker, M.K.; Houkpati, Y.; Irungu, J.; Oladunni, T. Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers. IEEE Access 2024. [Google Scholar] [CrossRef]
- Kasneci, E.; Seßler, K.; Küchemann, S.; Bannert, M.; Dementieva, D.; Fischer, F.; Gasser, U.; Groh, G.; Günnemann, S.; Hüllermeier, E.; others. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and individual differences 2023, 103, 102274. [Google Scholar] [CrossRef]
- Chang, Y.; Wang, X.; Wang, J.; Wu, Y.; Yang, L.; Zhu, K.; Chen, H.; Yi, X.; Wang, C.; Wang, Y.; others. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology 2024, 15, 1–45. [Google Scholar] [CrossRef]
- Kirchenbauer, J.; Geiping, J.; Wen, Y.; Katz, J.; Miers, I.; Goldstein, T. A watermark for large language models. International Conference on Machine Learning. PMLR, 2023, pp. 17061–17084.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; others. A survey of large language models. arXiv preprint 2023, arXiv:2303.18223 2023. [Google Scholar]
- Xu, F.F.; Alon, U.; Neubig, G.; Hellendoorn, V.J. A systematic evaluation of large language models of code. Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, 2022, pp. 1–10.
- Wei, J.; Tay, Y.; Bommasani, R.; Raffel, C.; Zoph, B.; Borgeaud, S.; Yogatama, D.; Bosma, M.; Zhou, D.; Metzler, D.; others. Emergent abilities of large language models. arXiv preprint arXiv:2206.07682 2022. arXiv:2206.07682 2022.
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D.; others. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems 2022, 35, 24824–24837. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).