4.1 AI Chatbots in Education
Technological advancements in AI have modernized the education industry, fostering a dynamic learning environment with a blend of technology and global connectivity.
AI Chatbots are becoming increasingly popular across a wide range of industries. AI Chatbots built for the education sector using NLP have been shown to provide significant benefits: they can act as conversational tools and answer education-related queries through text or speech using Artificial Intelligence Markup Language (AIML) (Chempavathy et al., 2022). They can also be available around the clock, serving as a continuous aid to students for their inquiries.
Technology has also helped overcome communication barriers. Ayedoun et al. (2015) proposed a conversational agent to help learners of English as a Foreign Language (EFL), based on a model of willingness to communicate (WTC). The results showed a boost in learners’ confidence and an increase in the frequency of communicating in English (Ayedoun et al., 2015). AI Chatbots can provide a personalized learning environment and, owing to their simple UI, are more intuitive than other web-based and mobile applications (Fadhil & Villafiorita, 2017). Additionally, AI Chatbots can aid visually impaired students in their learning and educational activities (Chempavathy et al., 2022).
The Hong Kong University of Science and Technology (HKUST) implemented a chatbot using Google Dialogflow for training teaching assistants (Gonda & Chu, 2019). The training program covered 550 full-time research postgraduate students who were trained as Graduate Teaching Assistants (GTAs) to assist in the teaching activities of undergraduate students (Gonda & Chu, 2019). This case study highlights some critical longstanding challenges in educational systems: instructor-to-student ratios are often unbalanced, which delays feedback from instructors and leaves them bombarded with repetitive queries. HKUST experienced a similar issue, where the instructor-to-GTA ratio was 1 to 200, making it challenging for instructors to provide timely support (Gonda & Chu, 2019). The diversity of the GTAs and the university's many divisions and departments made it challenging to address everyone’s needs while delivering training content (Gonda & Chu, 2019). In addition, the repetitive queries consumed a tremendous amount of instructors’ time, limiting their ability to focus on critical activities in class (Gonda & Chu, 2019). The GTAs relied on receiving feedback from instructors through discussion forums, email or an automatic feedback system in the LMS (Gonda & Chu, 2019). This resulted in significant delays and made it cumbersome to get clarification on basic questions such as grading criteria and submission. The chatbot solution was designed to address the key hurdles encountered by GTAs by providing dynamic support, overcoming diversity barriers and giving real-time feedback on repetitive queries (Gonda & Chu, 2019).
The case study reflects some common gaps encountered in the education system where chatbots can act as valuable tools for helping instructors with tedious and repetitive tasks along with providing students with a support mechanism to receive necessary, real-time feedback.
Recent developments in AI and ML have greatly improved the accuracy of AI Chatbots (Chempavathy et al., 2022). The need for AI Chatbots is rising in higher education due to the administrative and educational support they provide to instructors, enabling them with resources to create a personalized learning environment and enhance learning experiences for students while simultaneously reducing the overall cost (Gill et al., 2023).
Online education has unlocked a spectrum of opportunities and has provided the benefit of learning in a flexible and globally accessible environment. However, a key challenge is that despite having an abundance of features and resources available in online platforms, such learning platforms still lack functionalities that can provide students with an immediate response to their queries.
A range of commercial chatbot solutions are readily available within the business and the health sector but the education industry still lacks solutions that have integrated such tools into their learning platforms (Gonda & Chu, 2019).
The lack of functionality or a mechanism where students can communicate and receive feedback from educators in real-time can negatively impact a student’s performance and retention rate. Singapore Institute of Technology (SIT) experienced a similar issue with their Online Chemistry course where the overall completion rate was relatively low at around 15%, the key reason being that students were unable to get an immediate response from their instructors while completing the course (Atmosukarto et al., 2021). To overcome this problem, they developed an AI Chatbot using a deep-learning AI platform called Chatlayer (Atmosukarto et al., 2021). The Chatbot design was geared more towards providing guidance and clarifications to inquiries rather than giving definite answers to the assessments. The beta version of the chatbot gave promising results where students appreciated the chatbot being available 24/7, leading to the deployment of the chatbot to the production servers (Atmosukarto et al., 2021).
Traditional LMS (Learning Management System) platforms often provide structured learning and deliver standardized content for all users. They lack features that can provide customized content based on individuals’ learning needs or preferences (Subramanian et al., 2019). Many online learning platforms provide educational resources; however, these resources are contained within the platform, and users do not have the flexibility to access external resources from within it (Subramanian et al., 2019). For example, an LMS platform that delivers software engineering courses may lack the capability to make Application Programming Interface (API) calls to external sites such as Stack Overflow and GitHub for errors, code snippets and exceptions (Subramanian et al., 2019).
Subramanian et al. (2019) conducted a pilot study to incorporate such features by designing an AI chatbot called TutorBot using the Google Dialogflow interface and leveraging ML and NLP techniques. A total of 20 software engineers were selected from a multinational organization based in the US and India to take part in this pilot program by following a learning path on the topics of Blockchain and Data Science. The participants were divided equally into Group A and Group B: Group A used TutorBot, and Group B used a traditional LMS and web resources to complete the training program (Subramanian et al., 2019).
After two weeks of running the program, the results reflected a 90% improvement rate in obtaining content and relevant information on the two topics by participants from Group A using TutorBot; participants from Group A also saved 60% of their time compared to participants from Group B, who used traditional methods (Subramanian et al., 2019). The study also identified future enhancements that could improve TutorBot, such as supporting multilingual conversations and refining the accuracy of the ASR (automatic speech recognition) component to help participants with speech impairments (Subramanian et al., 2019). The research reflects the benefits of integrating AI Chatbots into LMS platforms to support students in broadening their research and accelerating their academic journey.
Chatbots are also gaining popularity for streamlining tasks in the administrative domain. Bhharathee et al. (2023) conducted research to showcase the value chatbots bring to administrative processes in the educational system. The study aimed to evaluate the potential of AI chatbots in assisting students with administrative queries when seeking admission into college programs. The chatbot was developed using BotPress and leveraged NLP and NLU techniques; the web integration was performed using a WordPress-based website and a WAMP server through the Header and Footer Scripts plugin of WordPress (Bhharathee et al., 2023). The responses generated by the chatbot had an accuracy of 90.6% and a CSAT score of 84.7% on various inquiries related to admission, accommodation, fees, etc. (Bhharathee et al., 2023). The research shows the potential of integrating AI chatbots into educational platforms to serve as virtual assistants in the administrative domain. Such a chatbot can serve as a resource for providing real-time information and can assist with repetitive administrative tasks from multiple sources.
Flipped education provides students with the opportunity to familiarize themselves with the course content via videos or research before interacting with teachers. This model has proven to be highly effective because students' learning does not start from scratch (Kim & Wong, 2023). To examine the effectiveness of this model, research was conducted with 32 student participants at a private Mexican institution by implementing Bing Chat in a Mathematics for Decision-Making course (Martínez-Téllez & Camacho-Zuñiga, 2023).
In the first stage of the research, the participants were asked to explore a theoretical presentation and solve two math problems without the aid of any instructor. The second stage involved discussing the solutions and clarifying their doubts with an instructor. In the final stage, the students were asked to provide their reflections and thoughts on the effectiveness of using Bing Chat throughout the study. The overall results were positive, showing Bing Chat to be a helpful tool for students in regard to their critical thinking processes (Martínez-Téllez & Camacho-Zuñiga, 2023).
4.2 ChatGPT
ChatGPT was initially designed to handle language translation tasks but is now a generative AI tool used for carrying on conversations and generating text in real time (Qadir, 2023). It offers several advantages over other language models such as BERT and XLNet, achieving higher accuracy through access to enormous datasets and a model with billions of parameters (Lund et al., 2023). Compared to other models developed by OpenAI such as DALL-E, which is designed for text-to-image generation, ChatGPT is based on a text-to-text generative AI model with the ability to hold a human-like conversation with natural dialog, making it suitable for a range of applications (Qadir, 2023).
The importance and the growing presence of ChatGPT can be seen in the health industry; for example, it is widely being used in the UK by the National Health Service (NHS) to communicate health-related information to the general public (Zaabi et al., 2023). It is also being used in other areas of the medical sector for diagnosing medical data and suggesting applicable treatment (Zaabi et al., 2023).
ChatGPT and AI models can be an asset to researchers in the education industry by examining vast amounts of data and providing predictions for future events (Ahadi et al., 2023). These predictive capabilities can contribute significantly to the academic areas of public health, environmental science, economics and political science (Ahadi et al., 2023).
Jalil et al. (2023) conducted a study to assess the accuracy and practicality of using ChatGPT for answering questions for an undergraduate-level software testing course offered at George Mason University. The experiment evaluated multiple questions from five chapters of the course textbook, and the study focused on research questions that determined the degree of correctness of the answers provided by ChatGPT (Jalil et al., 2023). Findings from the research showed that ChatGPT was able to respond to 77.5% of the questions, of which 55.6% were answered correctly or partially correctly, and the explanations provided for the answers were 53% correct or partially correct (Jalil et al., 2023). The research also indicated that the accuracy and degree of correctness could be improved further through follow-up questions (Jalil et al., 2023).
Lund et al. (2023) researched and analyzed the scholarly writing capabilities of ChatGPT, and the results showed that the software could craft professional papers and essays with human-like language that could even exceed the expectations of a doctoral-level student.
Laato et al. (2023) carried out extensive research, divided into three phases, to demonstrate the capabilities of ChatGPT in higher education. In the first phase, the authors familiarized themselves with the application for two months. The second phase involved formulating use cases for ChatGPT in higher education. The final phase involved evaluating how ChatGPT could assist students in completing courses for a Bachelor of Computer Science degree at a university based in Finland (Laato et al., 2023). The research spanned several weeks and included rigorous testing of ChatGPT’s responses on computer science-related topics such as machine learning, algorithms, microprocessors and various programming languages (Laato et al., 2023). The results revealed that ChatGPT was able to provide answers to the majority of the questions with a high level of accuracy and was also able to provide extensive answers for essay-based questions. However, it was noted that some answers lacked critical details and contained information that pertained to common definitions rather than applied concepts. Some inaccuracies were also found in responses to questions requiring practical implementation (Laato et al., 2023).
Figure 4.
Benefits of AI chatbots in education.
Sentiment analysis is the process of evaluating and classifying human emotions expressed in text data into categories of positive, negative or neutral (Ramanathan & Meyyappan, 2019). Sentiment analysis is gaining significant traction in the field of natural language processing (Tubishat et al., 2023). Research was conducted on understanding public sentiment towards adopting ChatGPT in the education industry by extracting data from the Twitter API using Python (Tubishat et al., 2023). The study specifically chose tweets as the source of input data because they provide valuable unfiltered views of the public, occur in real time and are accessible (Tubishat et al., 2023). The dataset consisted of a total of 11,830 tweets, collected over twelve days using a unique set of keywords (Tubishat et al., 2023). After the data was cleaned, the TextBlob library was used to categorize the tweets into positive, negative and neutral sentiments (Tubishat et al., 2023). The results were a good reflection of public opinion towards AI in education: 6,179 tweets fell in the positive category, 1,688 in the negative, and 3,963 in the neutral category (Tubishat et al., 2023). Positive opinions far exceeded the other sentiments, praising ChatGPT’s computing power and its consistency in delivering correct answers to various inquiries (Tubishat et al., 2023).
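To make the classification step and the reported tallies concrete, the following is a minimal Python sketch. It converts the counts reported in the study into percentage shares and shows how a polarity score might be mapped to one of the three labels. TextBlob exposes a polarity score in the range [-1, 1] through `TextBlob(text).sentiment.polarity`; the cut-off at zero used here, the function names and the sample scores are assumptions for illustration and are not taken from the study.

```python
from collections import Counter


def label_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a label (cut-off at 0 is an assumption)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def sentiment_shares(counts: dict) -> dict:
    """Convert raw label counts into percentage shares, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts as reported in the study (Tubishat et al., 2023).
reported = {"positive": 6179, "negative": 1688, "neutral": 3963}
shares = sentiment_shares(reported)

# Hypothetical polarity scores standing in for TextBlob output on four tweets.
sample_polarities = [0.8, -0.4, 0.0, 0.35]
sample_labels = Counter(label_from_polarity(p) for p in sample_polarities)

print(shares)
print(dict(sample_labels))
```

On the reported counts, this gives roughly 52% positive, 14% negative and 34% neutral tweets.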
4.2.1 ChatGPT for LMS
Integrating ChatGPT with Learning Management Systems (LMS) can enhance the learner’s interaction through instant feedback and support (Alshahrani, 2023). Moreover, it can be tailored to accommodate personalized learning and foster an interactive learning environment that enhances learners’ engagement and retention rates, specifically in higher education, where educators face challenges in providing individual assistance due to high student-to-teacher ratios (Alshahrani, 2023). LMS platforms can be enhanced by integrating ChatGPT via APIs to provide personalized support for students in answering questions, suggesting resources and generating quizzes and practice exams (Alabool, 2023).
Blended learning systems are hybrid academic models that comprise elements of online learning along with traditional face-to-face methods (Alshahrani, 2023). Blended models such as flipped classrooms and hybrid courses are becoming increasingly popular due to their flexibility and sustainability (Alshahrani, 2023). ChatGPT can increase the level of engagement in blended models and can aid in understanding complex concepts, creating an enjoyable, dynamic and interactive learning experience (Alshahrani, 2023). AI Chatbots such as ChatGPT that use Natural Language Processing (NLP) can also act as virtual instructors, changing the dynamics of education by supporting flipped educational models in which students learn before the class and use the actual class time for group discussions (Gill et al., 2023). Integrating ChatGPT with educational platforms can also empower learners with disabilities to overcome barriers by tailoring course content according to their specific needs (Alabool, 2023).
4.2.2 ChatGPT for Educators
ChatGPT’s capacity to conduct conversational-style exchanges with users makes it an ideal virtual tutoring system where learners can ask questions and receive feedback in real-time (Qadir, 2023). Integrating it as a virtual teaching assistant can reduce the work of instructors by helping them respond to students' queries along with providing daily support (Yinping & Yongxin, 2023). Instructors can use it to create intelligent tutoring programs that can be customized for students, offering them a personalized learning experience (Zaabi et al., 2023). It can also aid instructors in creating content such as quizzes, presentations, course outlines, etc. (Qadir, 2023).
Furthermore, ChatGPT can serve as an online training tool for instructors to help them advance their skills through professional development resources (Yinping & Yongxin, 2023). Instructors can use it as a tool for analyzing students’ performance and for identifying patterns and trends to improve teaching strategies (Alabool, 2023).
It can be a cost-effective solution for designing educational programs along with recommending resources based on the academic program (Alabool, 2023).
ChatGPT can aid research assistants by suggesting research ideas and methodologies that have been applied in previous studies (Bahrini et al., 2023). It can also be used as a tool for statistically analyzing data, finding relationships between data, interpreting data, and providing suggestions for future research (Bahrini et al., 2023).
Dai et al. (2023) conducted a study to determine the practicality of using ChatGPT as an assessment tool for providing feedback on students’ assignments. A dataset was taken from a postgraduate data science course at an Australian university, in which students were tasked with proposing a data science project based on a business scenario (Dai et al., 2023). Feedback on the assessments was given by instructors and by ChatGPT, the latter through a series of prompts based on the following rubric points: the clarity of the project goals, the relevance of the topic to data science, details on the business benefits, creativity of the topic and the overall clarity of the proposed solution (Dai et al., 2023). The feedback on each criterion received from ChatGPT and instructors was further graded by three experts on a five-point scale for fluency and coherence (Dai et al., 2023). The results were promising, with ChatGPT being able to analyze students' performance and provide detailed feedback alongside suggested learning strategies that students could adopt in the future (Dai et al., 2023).
Another study was conducted to analyze the gaps in incorporating ChatGPT into lesson plans for learning activities. The lesson plans were created by 29 pre-service elementary teachers, comprising 11 males and 18 females (Lee & Zhai, 2024). Most of the participants were sophomores in the field of science at a Korean teachers’ university (Lee & Zhai, 2024). They went through a three-week on-site teacher training school program, enabling them to experience a classroom-like environment (Lee & Zhai, 2024). The lesson plans were created for various science domains such as Physics, Chemistry, Biology, Earth Science, and Environmental Science (Lee & Zhai, 2024). The pre-service teachers were given four two-hour training sessions each week (Lee & Zhai, 2024). The first session was an overview of ChatGPT and LLMs, and the second session discussed the benefits of integrating ChatGPT into science courses. The second session also shed light on using ChatGPT to craft and refine scientific concepts. The third session explored various learning methodologies and teaching approaches in science, including role-playing, epistemological vee, analogy generation, etc. During the fourth session, the instructor trained the pre-service teachers to formulate lesson plans using the knowledge gained from prior coursework. The next phase of the study involved participants developing a lesson plan suitable for an elementary classroom lasting about 45 minutes (Lee & Zhai, 2024). They were required to incorporate at least one teaching and learning technique that they had studied in the “Science Education 1” course (Lee & Zhai, 2024). The lesson plan was required to have a learning objective, course content, and an outline (Lee & Zhai, 2024). Another requirement was to have either the student or the teacher ask ChatGPT a question, with the lesson plan then framed around its response (Lee & Zhai, 2024).
Additionally, the participants were required to include a simulated dialogue illustrating how the teacher or students might engage with ChatGPT during the lessons. The results of the study and the performance of the teachers’ lesson planning were evaluated using GenAI-TPACK, which examined four categories: the correct way to use ChatGPT in creating lesson plans, how well the curriculum goals can be supported by ChatGPT, how seamlessly ChatGPT integrates with teaching strategies and, lastly, how effectively curriculum goals, methods and techniques can be designed using ChatGPT (Lee & Zhai, 2024). To gain deeper insights into pre-service teachers’ GenAI-TPACK, the participants were also asked to complete a survey that comprised three open-ended questions (Lee & Zhai, 2024). The research results were examined by two researchers using two sets of data: 1) lesson plans, assessed using a scoring rubric, and 2) the responses received from the survey (Lee & Zhai, 2024). The overall results showed ChatGPT being effectively integrated into lesson plans for science subjects, covering methods such as group learning and predict-observe-explain (POE) (Lee & Zhai, 2024). Qualitative findings revealed that leveraging such strategies through ChatGPT increased student engagement and promoted thinking skills (Lee & Zhai, 2024). The pre-service teachers were good at aligning the lesson goals using ChatGPT but encountered difficulties selecting suitable features to achieve optimal results (Lee & Zhai, 2024). While they recognized the benefits and the potential of using ChatGPT, concerns were raised about its accuracy and students’ heavy reliance on it (Lee & Zhai, 2024). The research also proposed strategies to address such problems: diversifying information sources beyond ChatGPT and integrating teacher feedback to foster active engagement among educators and students (Lee & Zhai, 2024).
This study highlights both the advantages and complexities involved in integrating ChatGPT into learning environments.
4.2.3 ChatGPT for Learners
Leveraging ChatGPT in education can foster an active learning experience by responding to questions and offering spontaneous suggestions, hence creating a framework for an interactive learning model (Gill et al., 2023). This style of learning approach is attributed to the Two Sigma Effect, where the learning is enhanced by two standard deviations compared to traditional methods (Qadir, 2023). This method of learning also aligns with the concept of mastery learning in which students learn to strengthen their foundational concepts at their own pace before moving on to advanced materials (Qadir, 2023). Learning a programming language would be an ideal use case where students can gradually strengthen their foundational concepts of data structures and algorithms through interactive learning (Arista et al., 2023). Another example could be taken of understanding complex concepts found in the field of biomedical sciences where ChatGPT can provide a simplified explanation of a difficult concept (Zaabi et al., 2023).
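To illustrate what an improvement of two standard deviations means in practice, the short Python sketch below converts a shift measured in standard deviations into a percentile of the baseline group. It assumes that scores are normally distributed, which is an assumption made here for illustration; the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_after_shift(sigmas: float) -> float:
    """Percentile (0-100) of the baseline group reached by an average learner
    whose score is shifted upward by `sigmas` standard deviations."""
    return 100.0 * normal_cdf(sigmas)


two_sigma_percentile = percentile_after_shift(2.0)
print(f"{two_sigma_percentile:.1f}")  # prints 97.7
```

Under this assumption, an average learner who improves by two standard deviations would score above roughly 98% of learners taught with traditional methods.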
Integrating artificial intelligence systems along with virtual reality technologies can break the traditional barriers to learning and can also enhance student engagement (Arista et al., 2023). Students can leverage the ChatGPT interface as a system to help narrow down their research and delve deeper into a topic through a series of queries (Lund et al., 2023).
Researchers have been exploring the benefits that AI chatbots can bring to the software development industry. They have found that ChatGPT can be leveraged to generate code, translate software development requirements, and formulate use case diagrams and class diagrams, along with sequence diagrams for illustrating the flow between objects (Abdelfattah et al., 2023).
Speth et al. (2023) conducted a study to examine the quality of exercises generated by ChatGPT for a Java programming course at the University of Education. The research consisted of generating twelve exercise sheets ranging from beginner to intermediate level, covering various concepts following the framework of the object-first method (Speth et al., 2023). The experiment was aimed at demonstrating ChatGPT’s capability to serve as an instructor for a Java programming course. It was conducted by telling ChatGPT to play the role of a professor and giving it instructions that specified the list of topics and the expected learning goals (Speth et al., 2023). To evaluate the accuracy and relevance of the exercises generated by ChatGPT, eight participating students were asked to fill out a questionnaire using a Likert scale at the end of the programming course (Speth et al., 2023). The results showed that the quality of content created for some topics such as flow structures, APIs and inheritance was good (Speth et al., 2023).
The questionnaire responses also showed interesting results where the majority of the students could not detect that the exercise sheets were generated by ChatGPT. Other observations showed that AI Chatbots needed precise and detailed instructions to accurately generate the content (Speth et al., 2023).
It was also observed during the experiment that when slight changes to the generated exercises were requested, ChatGPT would modify a substantial amount of the content (Speth et al., 2023). Certain exercises, such as creating UML diagrams, could not be generated as diagrams, and the response was provided as text instead (Speth et al., 2023). Some other challenges were seen in generating exercises that involve finding errors in code. The overall results showed positive signs of ChatGPT being a powerful and useful tool for generating content in universities (Speth et al., 2023).
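The role-play setup described for this experiment, together with the observation that precise and detailed instructions were needed, can be sketched as a small prompt builder in Python. The function name, parameters and prompt wording below are hypothetical and are not the prompt used by Speth et al. (2023); the sketch only shows how a role, a list of topics and the expected learning goals might be combined into a single instruction.

```python
def build_exercise_prompt(topics, learning_goals, level="beginner"):
    """Assemble a role-play prompt asking a chatbot to draft one exercise sheet.

    The wording is illustrative only; it is not taken from the cited study.
    """
    if not topics or not learning_goals:
        raise ValueError("topics and learning_goals must be non-empty")
    topic_lines = "\n".join(f"- {topic}" for topic in topics)
    goal_lines = "\n".join(f"- {goal}" for goal in learning_goals)
    return (
        "You are a professor teaching a Java programming course.\n"
        f"Create one {level}-level exercise sheet.\n"
        "Topics to cover:\n"
        f"{topic_lines}\n"
        "Expected learning goals:\n"
        f"{goal_lines}\n"
        "State each task precisely and do not include solutions."
    )


# Hypothetical inputs for a single exercise sheet.
prompt = build_exercise_prompt(
    topics=["inheritance", "control flow"],
    learning_goals=["override a method in a subclass"],
    level="intermediate",
)
print(prompt)
```

Keeping the topics and learning goals as explicit lists makes it easier to regenerate a single sheet with one item changed, which may help when a small edit request would otherwise cause large parts of an exercise to be rewritten.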
Sudirman and Rahmatillah (2023) conducted research in Bandung to explore the benefits of ChatGPT to students in an Entrepreneurship program. The research consisted of 213 students who were divided into five classes and each class was further divided into groups of five students (Sudirman & Rahmatillah, 2023). The goal of the study was to analyze if students can use ChatGPT as a tool to generate information and use it to formulate their ideas. The participants were directed to use ChatGPT for gathering information related to designing a business mobile app for solving problems experienced by consumers or finding solutions for issues encountered due to existing technology; however, the participants were instructed not to directly ask ChatGPT to generate such an idea. The experience of the participants and the results were evaluated through a questionnaire. The results showed that many of the students found ChatGPT to be a useful tool for brainstorming ideas and getting data insights; they were keen on using it in other courses. The participants indicated they had an enjoyable experience and the overall research findings leaned towards showcasing ChatGPT as a tool for enhancing creativity and modernizing education (Sudirman & Rahmatillah, 2023).
AI Chatbots are also gaining popularity in language development, creating adaptive learning environments that can help learners strengthen their communication skills and learn foreign languages (Kovačević, 2023). They can be geared to offer a personalized learning program that helps English for Specific Purposes (ESP) students improve their vocabulary and grammar (Kovačević, 2023). The model can be trained using an English as a Second Language (ESL) dataset, and the generated exercises can be customized to meet students’ proficiency levels and learning needs (Kovačević, 2023).
4.3 Limitations and Drawbacks of ChatGPT in Education
ChatGPT offers a range of benefits in the educational sector; however, it also has its shortcomings. The following subsections delve deeper into the limitations and drawbacks associated with such technology in educational settings.
4.3.1 Ethical Implications
With the ability to generate text in natural language, academic integrity and plagiarism are key concerns in adopting ChatGPT in educational platforms. Plagiarism plays a critical role in protecting academic integrity and the ethical foundation of education. Investigations have shown that ChatGPT can reflect the information they produce is unique by surpassing well-known plagiarism detection tools such as Turnitin (Gill et al., 2023). This poses a critical concern in the education sector. ChatGPT can generate plausible data, but it does not reference the source of the data which leads to concerns involving the integrity of the information (De Silva et al., 2023). Students can impact the learning integrity by using ChatGPT to generate essays and research papers and submit them as their content (Alabool, 2023). They can also misuse the advanced AI capabilities of ChatGPT to cheat on exams and quizzes (Alabool, 2023). Ever since COVID-19, online assessments and learning have boomed at a rapid rate and have shown to offer a range of benefits however, with the emergence of ChatGPT, academic integrity might be at stake since students can use it to cheat during online exams or gain an unfair advantage against students who do not have access to such tools (Mosaiyebzadeh et al., 2023).
Accidental plagiarism is another growing concern faced by researchers when using AI Chatbots such as ChatGPT where the data generated does not contain links or references to the original source, this can result in unintentional plagiarism by not giving credit to the original author (Mosaiyebzadeh et al., 2023).
Detecting AI written content can be quite challenging even for large organizations such as OpenAI that are actively engaged in researching and developing solutions to provide transparency between AI and human written text. Advanced technology containing high computational power that can detect plagiarism in research and other sectors will be a much-needed necessity for educational organizations in the future (Memarian & Doleck, 2023). OpenAI had developed a classifier to detect AI-generated text however, with low accuracy rates of only 26%, it was discontinued in July 2023 with an aim to develop a more robust and accurate model (Kirchner et al., 2023).
The classifier had limited capabilities of producing inaccurate results with text containing less than 1000 characters and was only recommended for text written in English (Kirchner et al., 2023). LLMs often misinterpret the data which could lead to inaccurate results (Laato et al., 2023). Current LLM models such as ChatGPT are available commercially and uploading copyrighted material may violate intellectual property rights and copyright laws that could possibly lead to legal actions (Laato et al., 2023).
A poll of 1,000 college students in the US showed that 60% of students claimed to use AI chatbots to complete more than half of their coursework and 33% used such tools to complete their written assignments (Arista et al., 2023). Mosaiyebzadeh et al. (2023) proposed addressing this problem by reverting to invigilated exams or conducting oral exams, reducing the risk of cheating during online exams. Wagholikar et al. (2023) suggested another approach in which instructors design exams and assignments that require students to support their answers with diagrams and graphical representations.
Figure 5.
Limitations and drawbacks of ChatGPT in education.
4.3.2 Limitations in Understanding Human Emotions
AI Chatbots can provide guidance and feedback in the form of human-like conversations; however, they lack the capability to understand human emotions (Mosaiyebzadeh et al., 2023). Educators play a vital role in the education system, and their core responsibilities and skills extend far beyond sharing knowledge. They serve as key pillars in building a supportive learning environment that is based on trust and an understanding of each student's needs. Over the years, educational technology has evolved drastically, helping overcome many barriers, but one aspect of education that technology cannot replace is the human touch of understanding emotions and building relationships.
4.3.3 Programming Challenges with ChatGPT
While ChatGPT can enhance learning in the software development field, it also comes with its own set of limitations. Berrezueta-Guzman & Krusche (2023) conducted an experiment to test the capabilities of ChatGPT in solving programming assignments. The experiment analyzed 22 homework exercises from a first-year programming course offered at the Technical University of Munich (TUM) (Berrezueta-Guzman & Krusche, 2023). The results highlighted both the positive outcomes and the drawbacks of using ChatGPT in the educational system, specifically in programming courses. ChatGPT showed inaccuracies when solving problems related to JSON jobs and when generating the code of a basic calculator, and it was unable to solve two exercises related to UML diagrams and implementing GUI features (Berrezueta-Guzman & Krusche, 2023).
The experiment also highlighted long-term negative impacts ChatGPT could have on students’ abilities to think critically and independently in solving problems (Berrezueta-Guzman & Krusche, 2023). There should be awareness amongst learners to use ChatGPT as a supplemental tool and not as a replacement for traditional learning techniques (Sudirman & Rahmatillah, 2023).
4.3.4 Misinformation and Source Reliability Issues
Dependency on ChatGPT, the accuracy of its output, and its biased responses have raised serious concerns among researchers about adopting and using it in educational platforms (Arista et al., 2023).
ChatGPT can produce incorrect information that may hamper the learning process and the integrity of education for users who frequently rely on such platforms (Gill et al., 2023). Producing incorrect or false information is a key concern in AI technology and is often referred to as AI hallucinations (De Silva et al., 2023). ChatGPT being trained on vast amounts of data could lead to ambiguous and inaccurate results, hindering the overall learning process (Arista et al., 2023). The inaccuracies are due to the wide range of data available on the internet that is being used to train such models (Mosaiyebzadeh et al., 2023). OpenAI also acknowledges that there may be discrepancies in the results produced by ChatGPT (Athilingam & He, 2024).
Researchers using ChatGPT could also be prone to factual inaccuracies and may unintentionally plagiarize data or fail to credit the original author, as the generated responses cite neither the authors nor the source of origin (Arista et al., 2023). ChatGPT uses data from various sources available on the internet for training its models and generating results; this poses a risk of producing incorrect or false information, and the tool lacks the functionality to verify factual information (Mosaiyebzadeh et al., 2023). Generating incorrect factual information can severely impact the integrity of education by building students' learning and knowledge on inaccurate information.
Zuccon et al. (2023) tested the accuracy and quality of the data ChatGPT generates, along with the correctness of the references it provides for that data. The research covered topics from the agriculture domain taken from the Ag-valuate collection. The 160 topics used for this research were created by agriculture scientists and crop growers (Zuccon et al., 2023). This collection was chosen because all the questions and answers were readily available to the public and the information was backed up with references (Zuccon et al., 2023). Also, domain experts were available to validate the information provided by ChatGPT. The methodology involved prompting ChatGPT with questions on agricultural topics and instructing the tool to provide answers along with references to the source (Zuccon et al., 2023). The authors, along with a research assistant, checked whether the answers contained references and whether the sources indicated by ChatGPT were accessible; the answers were then further validated by an agricultural scientist and an expert in crop science (Zuccon et al., 2023).
The results showed that 49.4% of the answers were incorrect, 37.5% were partially correct, and only 13.1% were completely correct (Zuccon et al., 2023). The incorrect answers also contained misleading information, and some answers contained only general information (Zuccon et al., 2023). Of the 160 answers, 14 were not supported by references despite ChatGPT being instructed to include references for each answer (Zuccon et al., 2023). The majority (87%) of the references provided by ChatGPT were Wikipedia pages, and some cited sources appeared to be scientific journal articles but did not exist, which relates to the hallucination problem faced by the tool (Zuccon et al., 2023). Two scientific articles referenced by ChatGPT did not have the correct metadata, such as the authors and the years (Zuccon et al., 2023). The study revealed that only 18% of the answers had correct content that was consistent with the references cited by ChatGPT (Zuccon et al., 2023).
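To make the reported shares easier to interpret, the short sketch below converts them into approximate answer counts. It assumes that each percentage applies to all 160 answers; the counts it prints are derived here for illustration and are not figures reported by Zuccon et al. (2023). The function name `approximate_counts` is hypothetical.

```python
# Illustrative arithmetic only. Assumption: every percentage is a share of
# the same 160 answers. The resulting counts are derived, not reported.
TOTAL_ANSWERS = 160

reported_shares = {
    "incorrect": 49.4,
    "partially correct": 37.5,
    "completely correct": 13.1,
}


def approximate_counts(shares, total):
    """Return the nearest whole-number count for each percentage share."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


counts = approximate_counts(reported_shares, TOTAL_ANSWERS)

# Share of answers that came back without any reference (14 of 160).
unsupported_share = 14 / TOTAL_ANSWERS * 100

for label, n in counts.items():
    print(f"{label}: about {n} of {TOTAL_ANSWERS}")
print(f"answers without references: {unsupported_share:.2f}%")
```

Under this assumption the three categories sum to the full set of 160 answers, and the 14 answers without references correspond to just under 9% of the total.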
4.3.5 Reliance on ChatGPT
While the many advantages of ChatGPT for students are evident, it is crucial to keep in mind that student dependency on ChatGPT may also have negative effects on both students and instructors. Using ChatGPT daily can limit the ability to solve problems and generate ideas independently (Alabool, 2023). Relying heavily on such technology may disrupt the learning process and lead to frustration if system failures or malfunctions occur (Alabool, 2023). Also, heavy use of chatbots such as ChatGPT can greatly impact students' critical thinking and may hinder overall learning, because such tools can generate answers without the student engaging in the learning process (Mosaiyebzadeh et al., 2023).
Though ChatGPT can offer consistent support and guidance, it cannot match the level of interaction that human instructors provide through traditional teaching methods (Alabool, 2023). The constant use of ChatGPT impairs the ability to form social groups and foster an environment of collaborative learning (Alabool, 2023). Overreliance on ChatGPT can also undermine the principles of research, as students grow accustomed to the tool and stop seeking other sources when conducting research (Bahrini et al., 2023).
4.3.6 Security and Privacy Concerns
Security is crucial for any organization when it comes to integrating new technology into its existing infrastructure. Privacy and data security are key concerns in integrating AI Chatbots into education systems (Mosaiyebzadeh et al., 2023). Unauthorized access can lead to data breaches and can compromise students’ personal information (Mosaiyebzadeh et al., 2023). The cybersecurity community has raised concerns about attackers using ChatGPT to create phishing content and using such AI tools to make code changes (Wagholikar et al., 2023). Data breaches are another security concern because sensitive data such as student grades and personal information is stored (Bahrini et al., 2023). Integrating ChatGPT into educational platforms is a complex and rigorous process that requires a thorough evaluation of workflows and procedures (Arista et al., 2023).
4.3.7 Language Limitations and Accessibility
ChatGPT may provide an unfair advantage to students who are well-versed in the languages supported by the tool, and such barriers may lead to discrimination because not every individual will be able to reap the benefits of the tool (Memarian & Doleck, 2023). Another concern arises in regions where students do not have access to such tools; these students are at an unfair disadvantage compared to students who have the opportunity to access and utilize such technology as part of their learning journey (Gill et al., 2023). Furthermore, not all institutions will have the resources and financial backing to integrate and test this technology in their learning platforms. This may impact the growth and reputation of educational institutions that aim to modernize their platforms.
4.3.8 Limitations with Complex Tasks
ChatGPT is also known to have limitations in handling complex tasks, which may limit its use in certain research areas (Bahrini et al., 2023). A study was conducted to assess the potential of ChatGPT to serve as a viable diagnostic tool in nursing education (Gosak et al., 2024). A case study on preventive care was selected from Train4Health; based on it, a detailed description of a patient was entered into ChatGPT, and the tool was prompted for information regarding nursing diagnoses, interventions, and outcomes (Gosak et al., 2024).
The results showed that ChatGPT was able to analyze the case study and present the nurses with appropriate diagnoses along with planning goals and interventions related to the patient’s health. Though the tool was able to provide detailed diagnoses, the results deviated from the requested standards of the North American Nursing Diagnosis Association – International (NANDA-I) (Gosak et al., 2024).
The study revealed that ChatGPT is capable of analyzing data and can be used as a guide in nursing education; however, the diagnoses were not consistent with the NANDA-I standards, and some of the data it produced was incorrect (Gosak et al., 2024).