Preprint
Article

This version is not peer-reviewed.

Improving Emotional and Vocal Performance in Job Interview Simulations with an AI-Enhanced Chatbot for University Students

Submitted: 02 June 2026

Posted: 03 June 2026


Abstract
This study examines whether AI-powered chatbot training was associated with changes in university students’ facial and vocal emotional reaction time during simulated job interviews. A one-group pretest-posttest quasi-experimental design was used with 54 third- and fourth-year students enrolled in a Human Talent Management course at a private Latin American university. The study was implemented in an Applied Neuroscience Laboratory using iMotions-supported facial expression recognition, eye-tracking, and vocal tone analysis technologies. Participants first completed a baseline simulated interview, followed by three chatbot-based training sessions using HR-expert-validated questions, real-time scoring, and qualitative feedback. A final simulated interview was then conducted to compare pre- and post-training emotional indicators. Facial emotional reaction time was analyzed through aggregate indicators and specific emotions, including joy, surprise, anger, sadness, disgust, fear, and contempt. Vocal emotional reaction time was examined through happiness, sadness, anger, and neutrality. Pre-post differences were assessed using paired t-tests and Wilcoxon signed-rank tests. Results showed a significant increase in positive facial emotional reaction time, from 3.52% to 14.75%, and in joy, from 2.38% to 10.10%. Vocal happiness also increased from 2.79% to 10.71%. Several negative and neutral indicators decreased after training, although some of these changes were supported mainly by the Wilcoxon test and should be interpreted cautiously. Overall, the findings suggest that chatbot-based interview training may support emotional expressiveness and vocal modulation in simulated job interview settings, offering a complementary tool for employability training in higher education.

1. Introduction

Artificial intelligence has fundamentally reshaped how organizations approach talent acquisition over the past decade. Resume screening, candidate profile analysis, real-time interview interpretation, and interview preparation training are increasingly mediated by AI-driven systems, altering the dynamics of hiring for recruiters and applicants alike (Nuzula & Amri, 2023; Callejas et al., 2014). What was once a process reliant almost entirely on human judgment has gradually incorporated layers of algorithmic support, raising new questions about efficiency, fairness, and candidate experience.
Beyond operational improvements, AI holds particular promise for reducing the structural biases that have long complicated personnel selection. Automated tools can standardize evaluation criteria, minimize inconsistencies across interviewers, and deliver objective feedback at scale — benefits that extend beyond corporate HR departments into educational institutions preparing students for professional life (Callejas et al., 2014). In contexts where equitable access to career development resources remains uneven, this democratizing potential carries meaningful weight.
Among the most practical applications of this shift is the use of AI chatbots as interview simulation environments. Unlike traditional preparation methods, these tools allow candidates to rehearse responses repeatedly, without the social pressure or evaluative anxiety that often distorts performance in real settings (Roulin et al., 2019). Verbal clarity, argumentative coherence, and non-verbal communication management — factors long identified as decisive in interview outcomes (Lievens & De Paepe, 2004) — can be refined iteratively through structured, low-stakes practice.
This kind of preparation is especially relevant for university students and early-career professionals, who frequently enter selection processes with limited exposure to high-pressure evaluation contexts. Recruiters consistently rank emotional intelligence, stress management, and collaborative ability among the most valued competencies in candidates (Howe, 2014), yet these are precisely the skills most vulnerable to disruption when anxiety takes hold. Negative emotional states — including fear and apprehension — are well-documented responses to evaluative pressure and represent concrete barriers to authentic self-presentation (Xu, 2023; Brunet & Müller, 2024; Shen, 2023).
To assess the measurable impact of AI chatbot training on these dimensions, this study employed a set of applied neuroscience instruments: facial expression recognition software, eye-tracking technology, and vocal tone analysis. Facial emotional indicators and four vocal states (happiness, sadness, anger, and neutrality) were recorded for each participant before and after the training intervention, yielding quantifiable data on behavioral and affective change.
The results pointed in a consistent direction. Positive facial emotional reaction time increased from 3.52% to 14.75%, while negative expressions declined from 0.48% to 0.30%. Because the design included no control group, these figures are best read as pre-post changes associated with AI-assisted preparation rather than as definitive causal effects; they nonetheless support the case for integrating these tools into both academic training programs and professional recruitment pipelines.

2. Background

2.1. New Demands and Digital Transformation in Recruitment Processes

The labor market has undergone profound structural changes over recent decades, driven by globalization, digitalization, and, more recently, the disruptions brought about by the pandemic. These forces have compelled human resources departments to rethink their talent attraction and selection models, incorporating technological tools capable of managing growing volumes of applicants with greater efficiency and reduced bias (Mendoza, 2021). The virtualization of hiring processes has shifted from an operational convenience to an operational necessity, reshaping both the criteria by which candidates are assessed and the channels through which recruiters and applicants interact.
Within this landscape, organizations have progressively adopted automated systems for resume screening, interview scheduling, and preliminary competency assessment. These shifts respond to a growing institutional demand for agility and precision in selection processes, where time-to-hire and the quality of the candidate-role fit have become critical performance indicators for talent management teams (Piedra Mayorga et al., 2023). The integration of digital platforms into the recruitment cycle has also redefined the profile of the contemporary recruiter, who must now combine interpersonal competencies with command of increasingly sophisticated technological tools. Far from simplifying the process, this convergence has raised the bar for candidates, who are now expected to demonstrate their capabilities in environments mediated by screens, algorithms, and automated evaluation systems (Mendoza, 2021; Piedra Mayorga et al., 2023).

2.2. Virtual Recruitment Interviews and Artificial Intelligence

The job interview remains one of the most consequential moments in any selection process. Beyond assessing technical qualifications, it gives recruiters the opportunity to observe how a candidate communicates, how they handle pressure, and how they adapt to an unfamiliar interactional context (Mocha, 2018). In this sense, the interview functions simultaneously as a cognitive and emotional test — one in which interpersonal competencies often prove as decisive as domain knowledge. Emotional intelligence, understood as the capacity to perceive, understand, and regulate one’s own emotions and those of others, has emerged as a meaningful predictor of success in high-stakes evaluative contexts (Howe, 2014). Candidates with stronger emotional self-regulation tend to project greater confidence, clarity, and composure during interviews — qualities that recruiters consistently rank among the most valued in prospective hires.
The emotional states a candidate experiences during an interview are not incidental to their performance; they shape it directly, influencing the quality of responses, body language, and the overall impression conveyed to the evaluator (Shen, 2023; Van Doorn et al., 2014). Personality traits such as perfectionism or fear of failure can intensify anxiety responses even in technically competent candidates, undermining their ability to communicate effectively and present themselves favorably (Hogan et al., 2010). As a result, the development of soft skills — stress management, active listening, empathy, and assertive communication — has become a priority for both candidates and the institutions that prepare them, given the direct impact these competencies have on selection outcomes (Roulin et al., 2019).
The consolidation of remote and hybrid work environments has accelerated the migration from in-person to virtual interviews, introducing new challenges for both evaluators and applicants (Callejas et al., 2014). Within this shift, asynchronous technology-mediated interviews have gained traction as a scalable and flexible alternative. In this format, candidates respond to pre-recorded questions without a live interviewer present, which substantially alters the emotional and communicative dynamics of the process (Sardi & Troilo, 2020; Hemamou et al., 2019). The absence of real-time feedback and the awareness of being assessed by an automated system produce distinctive emotional responses that differ markedly from those observed in traditional face-to-face settings (Ottilie et al., 2023). Concurrently, artificial intelligence has begun to function not merely as a delivery channel but as an active evaluative agent — capable of analyzing responses, detecting behavioral patterns, and generating candidate profiles with a level of granularity that was previously inaccessible to selection teams (Albassam, 2023; Upadhyay & Khandelwal, 2018). This dual development — the virtualization of the channel and the automation of assessment — defines the new terrain in which both recruiters and candidates must operate.

2.3. Multimodal Analysis Technologies in Candidate Evaluation

The emergence of multimodal analysis tools has equipped recruitment systems with an unprecedented capacity to capture and process information beyond the verbal content of candidates’ responses. Natural Language Processing (NLP) software makes it possible to evaluate, in real time, the tone, argumentative clarity, and relevance of what candidates say, identifying linguistic patterns associated with competencies such as confidence, empathy, and problem-solving ability (Huang & Rust, 2018). This analytical capacity transforms the interview into a measurable, scalable event, substantially reducing the subjectivity that has historically characterized human evaluation.
Complementing these linguistic tools, technologies such as facial recognition, eye tracking, and voice modulation analysis provide a non-verbal dimension of candidate performance that until recently proved impossible to systematize. Facial recognition software identifies subtle emotional variations and microexpressions throughout the interview, while eye-tracking technology reveals patterns of attention and concentration in response to questions of varying difficulty or sensitivity (Balconi & Cassioli, 2022). Voice modulation analysis, in turn, detects shifts in pitch, tone, and volume — direct indicators of the speaker’s confidence level and emotional stability. When integrated, these data streams produce a multimodal candidate profile that considerably enriches decision-making in selection processes, while simultaneously opening new avenues for skills development and targeted interview preparation.

2.4. AI Chatbots as Digital Mentors in Interview Preparation

The incorporation of AI chatbots into interview preparation represents a meaningful departure from traditional readiness methods. Unlike static resources such as written guides or instructional videos, chatbots are capable of sustaining dynamic, responsive conversations, adapting in real time to the user’s answers and delivering personalized feedback at each interaction (Dixit et al., 2022). This interactional capacity positions them as genuine digital mentors — active companions in the candidate’s development process that simulate the evaluative reasoning of an experienced recruiter.
The analogy with a talent management specialist is well-founded. These systems are calibrated against criteria established by human resources professionals, enabling them to pose structured questions, assess the relevance of responses, and generate feedback cycles oriented toward continuous improvement (Chamorro-Premuzic, 2017). Research has also shown that chatbots can infer personality traits from response patterns, offering a complementary perspective on candidate profiles that goes beyond what self-reported assessments typically capture (Zhou et al., 2019). Furthermore, their round-the-clock availability and their fundamentally non-evaluative nature — in the sense that interactions carry no real-world consequences for the candidate — make them accessible training tools, free from the social pressure that tends to distort performance in actual interviews (Nawaz & Gomes, 2019).
The practical value of chatbot-based simulations lies in the opportunity for deliberate, repeated practice that they afford. The ability to complete multiple rounds of simulated interviews, receive specific feedback after each attempt, and progressively refine one’s responses supports the development of both communicative fluency and emotional regulation within a controlled environment (Roulin et al., 2019; Stephens et al., 2021). This iterative process is particularly valuable for university students and early-career professionals, who frequently enter selection processes with limited prior exposure to high-pressure evaluative settings. Chatbot simulations can replicate different interview styles — structured, situational, or competency-based — and adjust their difficulty level to match the candidate’s developmental stage, making them versatile instruments for personalized preparation (Dixit et al., 2022). When voice and emotional analysis are incorporated, these simulations extend their feedback to non-verbal dimensions of performance — voice modulation, eye contact, facial expression — aspects that candidates rarely have the opportunity to observe and address on their own (Batrinca et al., 2013). This multimodal feedback loop meaningfully amplifies the formative impact of practice, integrating the content of responses with the manner in which they are delivered into a single, coherent learning cycle.
This training approach finds a solid theoretical anchor in Kolb’s experiential learning model (2014), which conceives of learning as a cyclical process comprising four interconnected stages: concrete experience, reflective observation, abstract conceptualization, and active experimentation. Each stage performs a distinct function in transforming lived experience into applicable knowledge, making the model particularly well-suited to simulation-based interventions. In the context of chatbot-assisted interview preparation, the first simulated interview represents the concrete experience — the candidate’s initial encounter with an evaluative situation without prior preparation. The chatbot’s subsequent feedback constitutes the reflective observation phase, offering the participant a structured opportunity to identify strengths and areas for improvement. Successive practice rounds correspond to abstract conceptualization and active experimentation, stages during which candidates incorporate the recommendations received and apply them to refine their vocal modulation, emotional regulation, and response quality. The post-training interview closes the cycle, reflecting the consolidation of learning through more confident and emotionally controlled communication (Kolb, 2014). This alignment between the theoretical framework and the training design reinforces the validity of the experiential approach and helps explain the mechanisms through which iterative chatbot practice produces observable improvements in candidate performance.

2.5. Anxiety Reduction, Cognitive Feedback, and Interview Performance

The evidence on chatbot-based interview preparation points consistently toward measurable improvements across multiple performance dimensions. Prior research has documented gains in vocal modulation, response clarity, emotional stability, and the ability to handle unexpected questions — competencies that recruiters routinely weight heavily in their evaluations (Roulin et al., 2019; Stephens et al., 2021). These findings suggest that structured AI-assisted training does more than prepare candidates technically; it reshapes their emotional orientation toward the selection process itself.
Among the most significant outcomes of chatbot training is the development of self-confidence — an effect whose implications extend well beyond the immediate interview. A candidate who has rehearsed their responses across multiple simulated scenarios, received detailed feedback on both their verbal and non-verbal communication, and had the opportunity to correct mistakes in a consequence-free environment, arrives at a real interview with a psychological foundation that substantially changes how they experience the encounter (Chamorro-Premuzic, 2017; Zhang et al., 2022). This shift is not incidental. Perceived self-efficacy directly influences the quality of communication, the capacity to manage stress in the moment, and the overall impression a candidate makes on the evaluator. In this light, AI chatbots designed for interview preparation function not only as training tools but as instruments of personal development — ones that can help level the playing field for candidates with varying levels of prior experience and exposure to formal selection processes (Albassam, 2023; Nawaz & Gomes, 2019).
One of the most consistently documented benefits of this type of training is its capacity to reduce the anxiety and tension that candidates typically associate with evaluation. By offering an environment entirely free of real-world consequences, chatbots dissolve much of the social pressure inherent in traditional interviews, allowing candidates to focus on skill development without the emotional weight of external judgment (Howe, 2014). This reduction in affective load creates a more cognitively receptive state — one in which feedback can be absorbed and integrated far more effectively than it would be under stress.
The cognitive feedback that chatbots provide operates as the closing mechanism of the learning cycle, connecting observed performance to concrete, actionable recommendations. This process transforms emotion — often disruptive in evaluative contexts — into useful information for improvement, strengthening the candidate’s self-awareness and capacity for self-regulation (Shen, 2023). As candidates accumulate practice experiences accompanied by constructive feedback, their perceived self-efficacy grows, setting in motion a productive cycle in which confidence reinforces performance and performance reinforces confidence. This dynamic is especially relevant for candidates prone to perfectionism or with a history of evaluative anxiety, who may find in chatbot training a structured space to recalibrate their relationship with assessment before facing the pressures of a real selection process (Hogan et al., 2010; Van Doorn et al., 2014).

2.6. Research Questions and Objectives

The present study examines the effect of AI-powered chatbot training on the emotional responses, vocal expressiveness, and interview readiness of university students in simulated job interviews, with emphasis on facial and vocal manifestations. The simulations were designed to replicate the high-demand conditions characteristic of real selection processes. Data collection was conducted at the Applied Neuroscience Laboratory of a private Latin American university, using three complementary technologies: a) an eye-tracking system to monitor visual attention and cognitive engagement; b) a facial emotion recognition tool to identify a broad spectrum of affective states; and c) vocal modulation analysis software to detect emotional tone and expressiveness during speech.
Emotional responses were operationalized through two key indicators: a) the Facial Emotional Reaction Time Proportion (%), defined as the percentage of the total interview time during which affectively significant facial expressions were detected; and b) the Vocal Emotional Expression Time Proportion (%), corresponding to the percentage of speaking time in which emotional vocal markers were present. The facial categories analyzed include joy, sadness, anger, fear, surprise, disgust, contempt, confusion, sentimentality, and neutrality, while the vocal categories include happiness, sadness, anger, and neutrality.
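Both indicators are simple time proportions. The following minimal sketch illustrates the calculation under the assumption that the analysis software yields one detection flag per equally spaced sample (a video frame or an audio window). The function name, the input format, and the sample values are hypothetical and are not taken from the study's data or from the iMotions export format.

```python
def time_proportion(detected):
    """Return the percentage of samples in which an emotion was detected.

    `detected` holds one boolean per equally spaced sample
    (hypothetical input format; real exports may differ).
    """
    if not detected:
        raise ValueError("at least one sample is required")
    return 100.0 * sum(detected) / len(detected)


# Hypothetical 20-sample recording in which joy is detected in 3 samples.
joy_flags = [False] * 17 + [True] * 3
joy_pct = time_proportion(joy_flags)
print(f"{joy_pct:.2f}%")  # 15.00%
```

Under this reading, the facial indicator would take the whole interview as its denominator, whereas the vocal indicator would count only the samples in which the participant is speaking.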
Within this framework, the study poses the following research question:
To what extent does training with an AI-powered chatbot improve emotional regulation and vocal modulation in university students during simulated job interviews, as assessed through applied neuroscience tools?
To address this question, the following hypotheses are proposed:
H1: Iterative AI-powered chatbot training will significantly increase positive emotional facial expressions, particularly joy, after the intervention compared to baseline.
H2: Iterative AI-powered chatbot training will significantly increase positive vocal expressiveness, particularly vocal happiness, after the intervention compared to baseline.
H3: Iterative AI-powered chatbot training will significantly reduce negative facial and vocal emotional expressions, specifically contempt, disgust, sadness, and anger-related vocal markers.
H4: Iterative AI-powered chatbot training will significantly reduce excessive facial and vocal neutrality, indicating a shift toward more expressive and adaptive communication patterns.
The primary objective of the study is to determine whether this training modality, combined with emotional and cognitive monitoring, improves students’ affective regulation and vocal expressiveness, thereby enhancing their preparedness for actual selection processes. To this end, four specific objectives are pursued: a) to evaluate the overall effect of chatbot-based training on emotional response times by comparing pre- and post-intervention measurements; b) to identify shifts in facial emotional reactions toward more adaptive affective profiles; c) to analyze changes in vocal expressiveness in terms of emotional richness and speech dynamism; and d) to examine whether chatbot-based training reduces excessive facial and vocal neutrality as an indicator of more expressive and adaptive communication patterns.
Statistical analysis employed the paired-samples Student’s t-test as the primary method, with recourse to non-parametric testing in cases where normality assumptions were not met. All contrasts were conducted at a 95% confidence level, following prior verification of normality and homogeneity of variance assumptions.
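The two pre-post contrasts can be illustrated with a short sketch. It is not the study's analysis script: the statistical software used is not specified here, the data are invented, and the function names are hypothetical. The sketch relies only on the Python standard library, so it reports the paired t statistic with its degrees of freedom and uses the large-sample normal approximation for the Wilcoxon signed-rank test. A statistics package would normally supply exact p-values and the preliminary normality check, which are not shown.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def wilcoxon_signed_rank(pre, post):
    """Return (W+, two-sided p) using the large-sample normal approximation.

    Zero differences are dropped; tied absolute differences share their
    average rank. No tie or continuity correction is applied.
    """
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(rank for diff, rank in zip(ordered, ranks) if diff > 0)
    expected = n * (n + 1) / 4
    spread = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - expected) / spread
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p_value


# Hypothetical pre/post percentages for eight participants (not study data).
pre = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 1.5, 2.0]
post = [9.0, 12.5, 6.0, 15.0, 8.5, 14.0, 5.5, 11.0]

t_stat, df = paired_t(pre, post)
w_plus, p_value = wilcoxon_signed_rank(pre, post)
print(f"t({df}) = {t_stat:.2f}; W+ = {w_plus}, approximate p = {p_value:.3f}")
```

With only eight invented pairs the normal approximation is rough; it becomes more reasonable at a sample size such as the 54 participants in this study.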

3. Methods and Materials

This study was conducted with 54 third- and fourth-year university students enrolled in a Human Talent Management course at a private Latin American university. The research was carried out in the Applied Neuroscience Laboratory, using iMotions-supported facial expression recognition, eye-tracking, and vocal tone analysis technologies. The purpose of the study was to examine whether AI-powered chatbot training was associated with changes in students’ facial and vocal emotional reaction time during simulated job interviews.

3.1. Research Design

This study followed a one-group pretest-posttest quasi-experimental design. The same group of students was assessed before and after an AI-powered chatbot training intervention. No control group was included. Therefore, the study was designed to examine within-participant changes associated with the intervention, rather than to establish definitive causal effects.
The independent variable was participation in the chatbot-based interview training process. The dependent variables were the proportions of facial and vocal emotional reaction time recorded during the simulated job interviews. Facial indicators included positive, negative, neutral, confusion, sentimentality, joy, surprise, anger, sadness, disgust, fear, and contempt. Vocal indicators included happiness, sadness, anger, and neutrality.
This design was appropriate for the purpose of the study because the main objective was to observe whether students showed measurable changes in emotional expressiveness and vocal modulation after completing a structured chatbot-based training process. Accordingly, the findings are interpreted as pre-post changes associated with chatbot-based training, not as causal effects attributable exclusively to the intervention.

3.2. Participants

Participants were recruited from a Human Talent Management course because the course content was directly related to recruitment, selection, professional communication, and job interview preparation. This made the simulated interview activity academically relevant for the participants and consistent with the professional competencies addressed in the course.
The final sample consisted of 54 university students from the third and fourth academic years. All participants completed the same intervention structure, including the baseline simulated interview, the chatbot-based training sessions, and the post-training simulated interview.
The sample size was considered suitable for the exploratory and applied nature of the study. In behavioral and psychological studies using within-participant or repeated-measures designs, samples of this size are commonly used to identify moderate pre-post changes, because each participant serves as their own comparison unit. This design reduces the influence of individual baseline differences and increases sensitivity to change over time.
Although the sample was adequate for examining the main pre-post patterns in facial and vocal emotional reaction time, the study does not claim that the sample is sufficient to detect small effects or to generalize the findings to all university populations. For this reason, the results should be interpreted as evidence from an applied behavioral study and should be confirmed in future research with larger samples, control groups, and participants from different academic and institutional contexts.
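As a rough illustration of this sensitivity argument, the sketch below approximates the smallest standardized paired difference that a sample of 54 could be expected to detect. The 80% power target, the two-sided 5% significance level, and the normal approximation to the paired t-test are assumptions introduced for this example. They do not come from a power analysis reported in the study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def min_detectable_dz(n, alpha=0.05, power=0.80):
    """Approximate the smallest detectable standardized paired difference.

    Uses the normal approximation: (z_{1-alpha/2} + z_{power}) / sqrt(n).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) / math.sqrt(n)


dz = min_detectable_dz(54)
print(round(dz, 2))  # 0.38
```

Under these assumptions the detectable effect falls between the conventional benchmarks for small (0.2) and medium (0.5) standardized effects, which is consistent with the statement that the sample is not suited to detecting small effects.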
Participation was voluntary. Students were informed about the academic purpose of the study, the nature of the simulated interview activities, and the use of facial, vocal, eye-tracking, and chatbot interaction data for research purposes. All data were analyzed in aggregated form, and no individual participant was identified in the results.

3.3. Ethical Considerations and Data Protection

The research protocol was reviewed by the Ethics Committee in Research at Universidad Privada Boliviana. The protocol was classified as eligible for exemption from formal ethical evaluation because it involved a minimal-risk, non-invasive educational training activity based on simulated job interviews, voluntary participation, informed consent, and de-identified data analysis.
Although facial expression recognition, vocal tone analysis, and eye-tracking technologies were used, these tools were employed only to record behavioral and emotional indicators during the simulated interview process. The study did not involve clinical procedures, physical risk, psychological treatment, or invasive measurement techniques.
All data were used exclusively for academic research purposes. The database was de-identified before analysis, and the results were reported only in aggregated form. Informed consent was obtained from all participants before their participation.

3.4. AI-Powered Chatbot Design and Calibration

The AI-powered chatbot was designed specifically to support simulated job interview training. Its function was to reproduce a structured interview environment and provide students with immediate feedback on the quality of their answers.
The chatbot was built around five core interview questions commonly used in recruitment and selection processes. These questions were reviewed and validated by a panel of ten experts with doctoral-level training and professional experience in organizational psychology, organizational development, human resources selection, recruitment, and personnel induction. The expert panel had experience either in academic work related to human talent management or in professional practice with companies and recruitment processes.
The validation process focused on four main aspects: the professional relevance of the questions, the clarity of the wording, the realism of the simulated interview situation, and the usefulness of the questions for evaluating students’ communicative performance. The experts reviewed whether the questions reflected situations commonly faced by candidates in employment interviews and whether they were appropriate for assessing clarity, coherence, professional vocabulary, emotional tone, and confidence in oral expression.
All students received the same intervention structure. The chatbot used the same base prompt throughout the study to ensure consistency in the training process. Across the sessions, the questions maintained the same purpose and level of difficulty, although minor wording variations were introduced to reduce memorization and encourage more spontaneous responses. This allowed the intervention to remain standardized while also resembling the natural variation of a real job interview.
The chatbot evaluated each response using a 10-point scale. The scoring criteria included clarity, coherence, conciseness, vocabulary, emotional tone, and adequacy of the answer in relation to the expected response. After each answer, the chatbot provided qualitative feedback similar to the type of guidance that could be offered by a Human Resources specialist after a mock interview. The feedback focused on helping students organize their answers more clearly, use professional language, respond directly to the question, and project a more confident and positive communicative style.
At the end of each chatbot interaction, an overall score was calculated as the average of the five individual response scores. This score was used as formative feedback during the training process. However, the chatbot score was not treated as the main outcome variable in the present study. The main outcomes were the facial and vocal emotional reaction time indicators recorded in the laboratory during the pre-training and post-training simulated interviews.
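The session-level score described above amounts to an arithmetic mean of five ratings. A minimal sketch follows, assuming that each rating lies between 0 and 10; the lower bound, the function name, and the example scores are hypothetical, since the chatbot's actual scoring logic is not reproduced here.

```python
from statistics import mean

QUESTIONS_PER_SESSION = 5  # five core interview questions per session
MAX_SCORE = 10             # upper end of the 10-point scale


def overall_score(response_scores):
    """Average the per-question scores of one chatbot session."""
    if len(response_scores) != QUESTIONS_PER_SESSION:
        raise ValueError("expected one score per interview question")
    if any(score < 0 or score > MAX_SCORE for score in response_scores):
        raise ValueError("scores must lie on the 10-point scale")
    return mean(response_scores)


# Hypothetical scores for one session (not study data).
session_scores = [7, 8, 6, 9, 8]
print(overall_score(session_scores))  # 7.6
```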
The calibration process followed an iterative sequence. First, the expert panel reviewed the interview questions and evaluation criteria. Second, trial interactions were conducted to verify whether the chatbot produced coherent, constructive, and professionally relevant feedback. Third, adjustments were made to the wording of the questions, the tone of the recommendations, and the scoring logic. Finally, the prompt structure and question format were fixed before the intervention to ensure that all participants received a comparable training experience (see Table 1).

3.5. Procedure and Intervention

The intervention followed a structured sequence consisting of a pre-training assessment, three chatbot-based training sessions, and a post-training assessment.
First, participants completed a baseline simulated interview before receiving chatbot-based training. During this pre-training assessment, students answered a standard set of interview questions. Facial expression recognition, eye-tracking, and vocal tone analysis technologies were used to record the initial facial and vocal emotional indicators.
Second, students participated in three chatbot-based training sessions. In each session, they answered five interview questions and received immediate numerical and qualitative feedback from the chatbot. The feedback addressed the clarity, coherence, emotional tone, vocabulary, conciseness, and suitability of their responses. The second and third sessions allowed participants to apply the feedback received in previous interactions and progressively refine their answers.
Third, participants completed a post-training simulated interview. This final assessment followed the same general structure as the baseline interview. Facial expression recognition, eye-tracking, and vocal tone analysis were again used to record emotional and vocal indicators.
The pre-training and post-training measurements were then compared to examine whether students showed changes in facial and vocal emotional reaction time associated with the chatbot-based training (see Table 2).

3.6. Laboratory Technologies and Data Collection

The study was conducted in an Applied Neuroscience Laboratory equipped with technologies for facial expression analysis, eye-tracking, and vocal tone analysis. These tools were used during the pre-training and post-training simulated interviews to collect behavioral and emotional indicators.
Facial expression analysis was conducted using iMotions-supported facial recognition technology. This system quantified facial emotional reaction time across aggregate and disaggregated categories. The aggregate indicators were positive, negative, neutral, confusion, and sentimentality facial reaction time. The disaggregated indicators were joy, surprise, anger, sadness, disgust, fear, and contempt.
Eye-tracking was part of the laboratory setup and supported the interpretation of the facial expression data. Because facial expressions were recorded while participants interacted with visual interview stimuli, eye-tracking helped verify participants’ visual attention and engagement during the task. It was treated as complementary contextual information, not as a primary statistical outcome in the present analysis.
Vocal tone analysis was used to assess emotional characteristics in students’ speech during the simulated interviews. The vocal indicators analyzed were happiness, sadness, anger, and neutrality. These indicators were expressed as proportions of emotional vocal reaction time during the recorded interview.
For both facial and vocal indicators, emotional reaction time was expressed as a percentage. This percentage represented the proportion of recorded interview time during which a given emotional category was detected by the analysis system (see Table 3).
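The percentage definition above can be sketched as follows. This is an illustrative simplification, not the export format of the analysis system: it assumes equally spaced frames, each represented as a set of the categories detected in that frame, so that categories may co-occur and percentages need not sum to 100. The function name and the sample frames are hypothetical.

```python
def reaction_time_percentages(frames, categories):
    """Share of recorded frames (in %) in which each category was detected.

    `frames` is a list of sets; each set holds the categories detected
    in one frame (an assumed representation).
    """
    total = len(frames)
    if total == 0:
        raise ValueError("no frames recorded")
    return {c: 100.0 * sum(c in f for f in frames) / total for c in categories}

# Hypothetical recording of ten frames
frames = [{"neutral"}, {"neutral"}, {"joy"}, {"neutral"}, set(),
          {"joy", "surprise"}, {"neutral"}, {"neutral"}, {"neutral"}, {"neutral"}]
print(reaction_time_percentages(frames, ["neutral", "joy", "surprise"]))
# {'neutral': 70.0, 'joy': 20.0, 'surprise': 10.0}
```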

3.7. Variables and Measures

The main outcome variables were facial emotional reaction time and vocal emotional reaction time. Both were expressed as percentages of the recorded interview time.
Facial emotional reaction time was analyzed at two levels. First, aggregate indicators were examined: positive, negative, neutral, confusion, and sentimentality facial reaction time. Second, disaggregated facial emotions were analyzed: joy, surprise, anger, sadness, disgust, fear, and contempt. This two-level structure allowed the study to identify both general emotional patterns and specific facial expressions.
Vocal emotional reaction time was analyzed through four indicators: happiness, sadness, anger, and neutrality. These categories were selected because they represent relevant vocal-emotional states in interview contexts, where candidates are expected to communicate confidence, emotional stability, and professional engagement.
The unit of analysis was the participant. For each student, pre-training and post-training values were obtained for every facial and vocal indicator. Higher values indicated a greater proportion of time in which a given emotional category was detected during the simulated interview (see Table 4).

3.8. Data Analysis

The statistical analysis was designed to compare pre-training and post-training emotional reaction time indicators. Because the same participants were measured before and after the chatbot-based training, all comparisons were treated as paired comparisons.
Descriptive statistics were first calculated for each facial and vocal indicator. Pre-training and post-training means were reported to describe the direction and magnitude of change. Although the variables were expressed as percentages, means were retained because they provide a clear descriptive comparison of the pre-post differences and are commonly reported in applied behavioral research. The interpretation also considered the consistency of change across participants.
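To make the descriptive comparison concrete, the sketch below computes the pre-post change in percentage points for the three headline indicators, using the means reported in this study. The dictionary keys and the helper name `point_change` are illustrative choices only.

```python
def point_change(pre_mean, post_mean):
    """Pre-post change in percentage points (post minus pre)."""
    return round(post_mean - pre_mean, 2)

# Reported pre- and post-training means (% of recorded interview time)
reported_means = {
    "positive facial": (3.52, 14.75),
    "facial joy": (2.38, 10.10),
    "vocal happiness": (2.79, 10.71),
}

for name, (pre, post) in reported_means.items():
    print(name, point_change(pre, post))
# positive facial 11.23
# facial joy 7.72
# vocal happiness 7.92
```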
The paired t-test was used to assess mean differences between pre-training and post-training scores. This test was appropriate because the study compared two related measurements from the same participants. However, since emotional reaction time indicators are bounded percentages and may present asymmetric distributions or values close to zero, a non-parametric complementary analysis was also included.
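A minimal sketch of the paired t statistic is shown below, using only the Python standard library. It assumes that differences are taken as pre minus post, so that an increase after training yields a negative t; this sign convention and the example data are assumptions for illustration. The p-value is omitted because it requires Student's t distribution with n − 1 degrees of freedom, which a statistics library would normally supply.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, degrees of freedom) for paired samples, using pre - post."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two samples of equal length with n >= 2")
    diffs = [a - b for a, b in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1

# Hypothetical percentages for five participants
pre = [2.0, 4.0, 1.0, 3.0, 5.0]
post = [5.0, 6.0, 4.0, 7.0, 6.0]
t, df = paired_t(pre, post)
print(round(t, 3), df)  # -5.099 4
```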
The Wilcoxon signed-rank test was used as the non-parametric paired comparison. This test evaluates whether the distribution of pre-post differences is centered around zero and is appropriate when normality assumptions for paired differences may not be fully met. In this study, Wilcoxon was particularly relevant for indicators with low baseline values or non-normal distributions, such as sadness, disgust, contempt, and anger-related expressions.
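The sketch below illustrates how a Wilcoxon signed-rank Z value can be obtained with a normal approximation. It is a simplified version under stated assumptions: differences are taken as pre minus post, zero differences are dropped, tied absolute differences receive average ranks, and neither a tie correction nor a continuity correction is applied. Statistical packages may differ on these points, so the output need not match the values reported in the tables. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_z(pre, post):
    """Return (Z, two-sided p) from the normal approximation to W+."""
    diffs = [a - b for a, b in zip(pre, post) if a != b]  # drop zeros
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, averaging the ranks of ties
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data in which every participant increases after training
pre = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
post = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
z, p = wilcoxon_z(pre, post)
print(round(z, 3))  # -2.201
```

Under this convention, a consistent decrease after training produces a positive Z and a consistent increase produces a negative Z.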
The results tables report pre-training and post-training means, paired t-test statistics, paired t-test p-values, Wilcoxon signed-rank Z values, Wilcoxon p-values, and a binary indicator of whether the pre-post difference was statistically significant. Statistical significance was interpreted using the following thresholds: ***p < 0.01, **p < 0.05, and *p < 0.10.
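The significance thresholds above can be expressed as a small helper. The function name is hypothetical, and strict inequalities are used exactly as the thresholds are written.

```python
def significance_stars(p):
    """Map a p-value to the star notation used in the results tables."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

# Wilcoxon p-values reported for positive facial expression, neutrality, and fear
for p in (0.002, 0.038, 0.171):
    print(p, repr(significance_stars(p)))
# 0.002 '***'
# 0.038 '**'
# 0.171 ''
```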
When both the paired t-test and the Wilcoxon signed-rank test were significant, the result was interpreted as stronger evidence of change. When only the Wilcoxon test was significant, the finding was interpreted more cautiously as evidence of consistent participant-level change rather than a large average difference. This distinction was important because some emotional indicators may change consistently across participants without producing a large mean difference.
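The interpretation rule described above can be sketched as a simple decision function. The cut-off `alpha = 0.05` is an assumption made for illustration, as are the function name and the label wording; the case in which only the t-test is significant is not covered by the stated rule and is flagged as such.

```python
def evidence_label(p_t, p_wilcoxon, alpha=0.05):
    """Classify a pre-post result from the two paired-test p-values."""
    t_sig = p_t < alpha
    w_sig = p_wilcoxon < alpha
    if t_sig and w_sig:
        return "stronger evidence of change"
    if w_sig:
        return "consistent participant-level change (interpret cautiously)"
    if t_sig:
        return "t-test only (not covered by the stated rule)"
    return "no significant change"

# p-values reported for facial contempt, sadness, and surprise
print(evidence_label(0.040, 0.035))  # stronger evidence of change
print(evidence_label(0.470, 0.008))  # consistent participant-level change (interpret cautiously)
print(evidence_label(0.500, 0.904))  # no significant change
```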
Given the absence of a control group, the study does not claim definitive causal effects. The results were interpreted as pre-post changes associated with chatbot-based training. This interpretation is consistent with the one-group pretest-posttest quasi-experimental design and recognizes that factors such as practice, familiarity with the interview format, or repeated exposure may also have contributed to the observed changes (see Table 5).

4. Results

This section reports the pre- and post-training results for facial and vocal emotional reaction time during simulated job interviews. The analysis examines whether AI-powered chatbot training was associated with changes in students’ emotional expressiveness, emotional regulation, and vocal modulation.
Data were collected in the Applied Neuroscience Laboratory of a private Latin American university using facial expression recognition software, vocal tone analysis tools, and eye-tracking devices. Facial and vocal indicators were analyzed as percentages of emotional reaction time during the interview simulations. Eye-tracking was used as complementary contextual information to support the interpretation of participants’ engagement during the task.
Pre- and post-training differences were assessed using the paired t-test and the Wilcoxon signed-rank test. The paired t-test was used to examine mean differences, while the Wilcoxon signed-rank test was included as a non-parametric complementary analysis because emotional reaction time variables are bounded percentages and may not satisfy normality assumptions. The results tables report mean values, test statistics, p-values, and whether each difference was statistically significant.
The findings are presented in two subsections. Section 4.1 reports facial emotional reaction time, including both aggregate and disaggregated emotional indicators. Section 4.2 presents vocal emotional reaction time, focusing on happiness, sadness, anger, and neutrality. This organization allows the analysis to distinguish between general emotional patterns and specific emotional responses observed before and after the chatbot-based training.
The results are presented in relation to the research question and the four hypotheses formulated in Section 2.6, distinguishing between facial emotional reactions and vocal emotional modulation before and after the AI-powered chatbot training.

4.1. Facial Emotional Reaction Time in Simulated Job Interviews

Facial expressions provide relevant indicators of emotional involvement, self-presentation, and psychological readiness during simulated job interviews. In this study, facial emotional reaction time was analyzed before and after the AI-powered chatbot training in order to identify changes in positive, negative, neutral, and specific emotional expressions.
Table 6 presents the results for facial emotional reaction time. The paired t-test was used to compare mean differences between pre- and post-training scores, while the Wilcoxon signed-rank test was included as a complementary non-parametric analysis. This approach was considered appropriate because emotional reaction time variables are expressed as percentages and may not fully meet normality assumptions.
Overall, the results suggest that chatbot-based training was associated with an increase in positive facial expression and a reduction in several negative or less adaptive facial indicators. The strongest changes were observed in positive emotional facial reaction time and joy, both of which increased significantly after training. Other indicators, such as negative emotion, confusion, sadness, disgust, and contempt, showed significant changes mainly in the Wilcoxon signed-rank test, suggesting that some effects may be better captured at the participant-level rank distribution than through mean differences alone.
These findings support H1 and partially support H3 and H4: after chatbot training, positive facial expressions increased, particularly joy, while several negative and neutral facial indicators decreased.

4.1.1. Positive Emotional Expressions

The Proportion of Positive Emotional Facial Reaction Time increased from 3.52% before training to 14.75% after training. This difference was statistically significant in both the paired t-test and the Wilcoxon signed-rank test (t = -3.16, p < 0.001; Z = -3.05, p = 0.002).
This result indicates that participants displayed a higher proportion of positive facial expressions after completing the chatbot-based training. The convergence between the parametric and non-parametric tests strengthens the interpretation that the intervention was associated with a meaningful increase in positive facial expressiveness.
The disaggregated results show that joy followed a similar pattern. The Proportion of Joy Emotional Facial Reaction Time increased from 2.38% to 10.10%, with statistically significant results in both tests (t = -2.95, p < 0.001; Z = -2.93, p = 0.003). This finding suggests that the increase in overall positive facial expression was mainly driven by greater displays of joy after training.
In simulated job interview contexts, this change may be relevant because facial expressions associated with joy can contribute to perceptions of enthusiasm, approachability, and confidence. Among the facial indicators analyzed, positive expression and joy showed the most consistent improvement across both statistical tests.
In contrast, surprise remained practically unchanged, moving from 7.48% to 7.45%. This difference was not statistically significant (t = 0.01, p = 0.500; Z = 0.12, p = 0.904). Similarly, fear decreased slightly from 6.61% to 6.11%, but the change was not significant (t = 0.16, p = 0.870; Z = 1.37, p = 0.171). These results suggest that the chatbot training did not substantially modify emotional responses that may be more closely related to uncertainty, novelty, or evaluative stress.
Anger increased slightly from 0.66% to 0.84%, but this change was not statistically significant (t = -0.47, p = 0.640; Z = -0.62, p = 0.535). Given the low baseline level of anger-related facial expression, this result suggests that anger remained relatively stable and was not meaningfully affected by the intervention.

4.1.2. Negative Emotional Expressions

The Proportion of Negative Emotional Facial Reaction Time decreased from 1.82% before training to 0.70% after training. The paired t-test showed a marginal result (t = 1.49, p = 0.070), while the Wilcoxon signed-rank test indicated a statistically significant difference (Z = 2.58, p = 0.010).
This finding suggests a reduction in negative facial expression after chatbot training. However, the result should be interpreted with some caution because the two tests do not fully converge. The marginal t-test indicates that the average reduction was moderate, whereas the Wilcoxon result suggests that the direction of change was sufficiently consistent across participants to reach statistical significance.
At the disaggregated level, sadness decreased from 0.48% to 0.30%. The paired t-test did not show a significant difference (t = 0.71, p = 0.470), but the Wilcoxon signed-rank test indicated a significant change (Z = 2.67, p = 0.008). This suggests that, although the average decrease was small, several participants may have shown consistent reductions in sadness-related facial expression.
Disgust also decreased, from 0.45% to 0.06%. The paired t-test showed a marginal result (t = 1.31, p = 0.100), while the Wilcoxon test was statistically significant (Z = 2.19, p = 0.029). This pattern indicates that the reduction in disgust was not strongly reflected in the mean comparison but was evident in the non-parametric participant-level analysis.
Similarly, contempt decreased from 3.34% to 0.62%, and this difference was statistically significant in both tests (t = 1.76, p = 0.040; Z = 2.11, p = 0.035). This is a relevant finding because contempt-related facial expression may be interpreted negatively in interview settings, as it can be associated with detachment, arrogance, or low interpersonal openness.
Taken together, these findings suggest that the chatbot-based training was associated with a reduction in several negative facial indicators. Nevertheless, because some effects were supported mainly by the Wilcoxon test, these results should be interpreted as evidence of consistent individual-level changes rather than uniformly large average effects.

4.1.3. Neutral and Confusion Expressions

The Proportion of Neutral Emotional Facial Reaction Time decreased from 83.17% before training to 70.18% after training. This reduction was statistically significant in both the paired t-test and the Wilcoxon signed-rank test (t = 1.70, p = 0.050; Z = 2.08, p = 0.038).
This result suggests that participants displayed less facial neutrality after completing the chatbot training. In interview contexts, neutrality can reflect composure and emotional control; however, excessive neutrality may also be perceived as low enthusiasm, limited engagement, or reduced interpersonal expressiveness. Therefore, the observed reduction may indicate a shift toward a more expressive facial presentation.
The Proportion of Confusion Emotional Facial Reaction Time decreased from 1.07% to 0.70%. The paired t-test did not show a statistically significant difference (t = 0.54, p = 0.600), but the Wilcoxon signed-rank test indicated a significant change (Z = 2.42, p = 0.016).
This result suggests that the average decrease in confusion was small, but the direction of change may have been consistent among participants. In the context of interview training, a reduction in confusion-related facial expression may reflect improved familiarity with the interview format, better understanding of expected responses, or greater confidence when facing evaluative questions.

4.1.4. Sentimentality

The Proportion of Sentimentality Emotional Facial Reaction Time increased from 1.09% before training to 2.18% after training. This change was statistically significant in both the paired t-test and the Wilcoxon signed-rank test (t = -1.64, p = 0.050; Z = -2.01, p = 0.044).
This finding suggests that participants showed a higher proportion of sentimentality-related facial expression after the chatbot-based training. The interpretation of this result should be cautious, as sentimentality is not necessarily equivalent to positive emotional performance. However, in interview contexts, moderate increases in this type of expression may reflect greater emotional openness, sincerity, or interpersonal sensitivity.
From a practical perspective, this change may indicate that participants became more emotionally expressive after the intervention, not only through joy but also through more nuanced affective displays. This may be relevant in simulated interviews that require candidates to communicate personal experiences, motivation, or interpersonal skills.

4.2. Emotional Vocal Reaction Time in Simulated Job Interviews

Vocal expression is an important component of communication during job interviews, as it can convey confidence, emotional stability, enthusiasm, and interpersonal disposition. In simulated interview contexts, changes in vocal tone may provide relevant information about how candidates manage evaluative pressure and how they project themselves during the interaction.
Table 7 presents the results for emotional vocal reaction time before and after the AI-powered chatbot training. The analysis includes both the paired t-test and the Wilcoxon signed-rank test. The paired t-test was used to compare mean differences between pre- and post-training scores, while the Wilcoxon test was included as a non-parametric alternative, considering that emotional vocal reaction times are expressed as percentages and may not follow a normal distribution.
Overall, the results indicate an increase in positive vocal expression, particularly happiness, and a reduction in vocal neutrality, sadness, and anger-related vocal patterns after the chatbot-based training. However, the strength of the evidence varies across indicators, especially in those cases where significance was observed only in the non-parametric test.
These findings support H2 and partially support H3 and H4: vocal happiness increased after training, while vocal neutrality and some negative vocal markers decreased.

4.2.1. Happiness Vocal Modulation

The Proportion of Happiness Emotional Vocal Reaction Time increased from 2.79% before training to 10.71% after training. This difference was statistically significant in both the paired t-test and the Wilcoxon signed-rank test (t = -3.49, p < 0.001; Z = -3.21, p = 0.001).
This result suggests that participants expressed a higher proportion of positive vocal tone after completing the chatbot training. The consistency between both statistical tests strengthens the interpretation that the intervention was associated with a meaningful increase in vocal happiness.
In the context of simulated job interviews, this change may be relevant because a more positive vocal tone can contribute to perceptions of enthusiasm, confidence, and communicative engagement. Among the vocal indicators analyzed, happiness showed the clearest improvement after training.

4.2.2. Neutrality in Vocal Modulation

The Proportion of Neutrality Emotional Vocal Reaction Time decreased from 66.67% before training to 62.72% after training. The paired t-test showed a marginal result (t = -1.67, p = 0.100), whereas the Wilcoxon signed-rank test indicated a statistically significant difference (Z = 1.97, p = 0.049).
This result suggests a modest reduction in vocal neutrality after the training. Although the mean difference was not statistically significant under the parametric test, the Wilcoxon result indicates that the direction of change was consistent enough across participants to reach statistical significance in the non-parametric analysis.
From a practical perspective, a decrease in vocal neutrality may indicate a shift toward a more expressive and less monotonous speaking style. In interview situations, excessive neutrality may be interpreted as limited engagement or low enthusiasm; therefore, this reduction may represent a favorable change in candidates’ vocal presentation.

4.2.3. Sadness in Vocal Modulation

The Proportion of Sadness Emotional Vocal Reaction Time decreased slightly from 13.84% before training to 12.71% after training. The paired t-test did not show a statistically significant difference (t = -0.24, p = 0.400). However, the Wilcoxon signed-rank test indicated a significant result (Z = 1.97, p = 0.049).
This finding should be interpreted with caution. The non-significant t-test suggests that the average reduction in vocal sadness was small. Nevertheless, the Wilcoxon result indicates that some participants may have shown consistent reductions in sadness-related vocal expression after the training.
Although the magnitude of change was limited, this result may still be relevant in the context of interview preparation. Vocal sadness can be associated with insecurity, discomfort, or reduced confidence. Therefore, even modest reductions in this vocal pattern may reflect an improvement in emotional self-regulation during simulated interviews.

4.2.4. Anger in Vocal Modulation

The Proportion of Anger Emotional Vocal Reaction Time decreased from 1.64% before training to 1.41% after training. The paired t-test did not show a statistically significant difference (t = 0.54, p = 0.600), while the Wilcoxon signed-rank test showed a significant result (Z = 2.46, p = 0.014).
As in the case of sadness, this result should be considered carefully. The mean difference was small, which explains the non-significant t-test. However, the Wilcoxon result suggests that reductions in anger-related vocal patterns may have occurred consistently among some participants.
Because anger-related vocal expression was low at baseline, large changes were not expected. Even so, a reduction in vocal cues associated with tension, harshness, or emotional rigidity may be beneficial in interview contexts, where candidates are expected to communicate composure and emotional control.
Taken together, the vocal results suggest that the chatbot-based training was associated with improvements in participants’ vocal emotional expression. The strongest evidence was observed for vocal happiness, which increased significantly in both statistical tests. This finding supports the interpretation that participants became more capable of projecting a positive and engaged vocal tone after the intervention.
The reductions in neutrality, sadness, and anger were more moderate. These changes were mainly supported by the Wilcoxon signed-rank test, suggesting that they may reflect consistent individual-level changes rather than large average differences. For this reason, these results should be interpreted as promising but less robust than the increase observed in vocal happiness.
Overall, the findings indicate a shift from more neutral or restrained vocal patterns toward a more positive and expressive vocal style. This supports the relevance of AI-powered chatbot training as a tool for strengthening communicative readiness in simulated job interview settings, while also highlighting the need for future studies with larger samples and participant-level longitudinal analyses.

5. Discussion

The findings of this study suggest that AI-powered chatbot training was associated with favorable pre-post changes in university students’ facial and vocal emotional reaction time during simulated job interviews. Overall, participants showed higher levels of positive emotional expression after the intervention, particularly in facial joy and vocal happiness. At the same time, several negative and neutral indicators decreased, although the magnitude and statistical consistency of these changes varied across emotional categories and modalities.
These results should be interpreted within the scope of a one-group pretest-posttest quasi-experimental design. Since no control group was included, the study does not claim definitive causal effects. Rather, the findings provide evidence of within-participant changes associated with chatbot-based interview training. This distinction is important because improvements may also be influenced by repeated exposure to the interview format, greater familiarity with the questions, and the opportunity to practice in a structured environment.
In relation to the hypotheses proposed in Section 2.6, the findings provide strong support for H1 and H2. Positive facial emotional reaction time and joy increased after training, and vocal happiness also showed a clear increase. These were among the most consistent results, as they were supported by both the paired t-test and the Wilcoxon signed-rank test. The findings also provide partial support for H3 and H4, since several negative and neutral indicators decreased after the intervention. However, these results should be interpreted with more caution because some changes were significant mainly in the Wilcoxon signed-rank test, suggesting consistent participant-level changes rather than uniformly large mean differences.
Taken together, the results indicate that chatbot-based interview training may support a shift from more neutral or restrained emotional patterns toward a more positive and expressive communicative profile. This shift was observed in both facial and vocal indicators, suggesting that the intervention may have contributed to students’ emotional expressiveness and vocal modulation during the simulated interview task.

5.1. The Mechanism of Emotional Shift: Experiential Learning and Affective Regulation

Job interviews expose candidates to an evaluative situation in which they must organize their responses, manage pressure, and present themselves in a confident and coherent manner. For university students and early-career candidates, this type of situation can be particularly demanding because many have limited experience with formal selection processes. In the present study, the pre-training measurements showed a predominantly neutral facial profile, together with the presence of some negative emotional indicators such as sadness, disgust, and contempt. These patterns should not be interpreted as direct evidence of clinical anxiety, since anxiety was not measured as a psychological construct in this study. However, they may reflect the emotional demands commonly associated with simulated evaluative contexts.
After chatbot-based training, participants showed increases in positive facial expressions, particularly joy, and reductions in several negative or neutral indicators. This pattern is consistent with the principles of experiential learning. The chatbot offered students repeated practice, immediate feedback, and the opportunity to adjust their responses across sessions. In this sense, the intervention created a structured learning cycle in which participants could first experience the interview situation, then reflect on their responses through feedback, and finally apply improvements in subsequent attempts.
The observed changes may also be understood in terms of affective regulation. By becoming more familiar with the interview structure and receiving guidance on how to improve their answers, students may have been better able to organize their communication and reduce less adaptive emotional reactions. The chatbot did not function as a substitute for human evaluation, but as a training mechanism that supported practice, self-correction, and more deliberate self-presentation.
The use of iMotions-supported facial expression recognition, eye-tracking, and vocal tone analysis provided objective behavioral indicators to examine these changes. These technologies allowed the study to capture emotional and vocal patterns that are difficult to observe through self-report alone. Eye-tracking also provided contextual support for interpreting facial expression data, since it helped verify participants’ visual engagement with the interview stimuli.
From the perspective of affective computing, this study illustrates how AI-supported training environments can be combined with behavioral measurement technologies to examine emotional expression in applied educational contexts. The results do not imply that students fully mastered emotional regulation after the intervention, but they do suggest that structured chatbot-based practice was associated with more positive and adaptive emotional displays during the final simulated interview.
Among the facial indicators, the most consistent changes were observed in positive emotional facial reaction time and joy. Positive emotional facial reaction time increased from 3.52% to 14.75%, while joy increased from 2.38% to 10.10%. These changes were statistically significant in both tests, which strengthens their interpretation as robust pre-post differences. The reduction in contempt was also relevant because it was supported by both the paired t-test and the Wilcoxon signed-rank test. Other reductions, such as sadness, disgust, and confusion, should be interpreted more cautiously, as they were mainly supported by the Wilcoxon signed-rank test.

5.2. Multimodal Coherence and Vocal Expressiveness: Projecting Professionalism Under Evaluative Pressure

Facial expressions and vocal modulation are complementary channels of emotional communication. While facial expression provides visual information about affective involvement, voice carries emotional tone through rhythm, intensity, and modulation. In simulated job interviews, both channels are relevant because candidates are evaluated not only by what they say, but also by how they communicate confidence, engagement, and emotional stability.
The vocal findings complement the facial results by showing changes in how participants expressed emotion through speech. The clearest vocal result was the increase in happiness, which rose from 2.79% before training to 10.71% after training. This difference was significant in both the paired t-test and the Wilcoxon signed-rank test, making it the strongest result in the vocal analysis. This suggests that, after training, participants tended to use a more positive and expressive vocal tone during the simulated interview.
The reductions in vocal neutrality, sadness, and anger were more moderate. Vocal neutrality decreased from 66.67% to 62.72%, while sadness and anger also showed slight reductions. However, in these indicators, the Wilcoxon signed-rank test provided stronger evidence than the paired t-test. This pattern suggests that some changes may have occurred consistently at the participant level, even though the mean differences were smaller. For this reason, these findings should be interpreted as promising but less robust than the increase in vocal happiness.
The decrease in vocal neutrality is particularly relevant from a communication perspective. A neutral tone can reflect composure, but excessive neutrality may also be perceived as monotony, low engagement, or limited enthusiasm. Therefore, the reduction in vocal neutrality may indicate a modest movement toward a more expressive and dynamic speaking style. Similarly, reductions in sadness and anger-related vocal patterns may suggest improved emotional control in some participants, although these results require cautious interpretation due to the limited magnitude of change.
Taken together, the facial and vocal results suggest a general movement toward a more positive and expressive communicative profile after chatbot-based training. This does not mean that the intervention fully transformed students’ interview performance, but it does indicate that structured practice with immediate feedback may be associated with favorable changes across more than one channel of emotional expression.
This multimodal pattern is relevant because interview performance is not built only through verbal content. Recruiters and evaluators also attend to tone of voice, facial engagement, and the overall coherence between what a candidate says and how that message is expressed. In this sense, the study contributes to understanding how AI-supported training may help students prepare not only the content of their answers, but also the emotional and vocal manner in which they deliver them.
However, the interpretation should remain measured. The results suggest an association between chatbot-based training and improvements in emotional expressiveness and vocal modulation, but they do not demonstrate that the chatbot alone caused these changes. Future studies with control groups and external evaluator ratings would be necessary to determine whether these emotional and vocal changes translate into stronger interview evaluations or better recruitment outcomes.

5.3. Implications for Human Resources and Educational Practices

The findings have practical implications for higher education and human resources training. In university contexts, chatbot-based interview training may offer a scalable tool to help students practice professional communication in a structured and low-risk environment. This is especially relevant in courses related to human talent management, employability, career readiness, and professional development, where students need opportunities to rehearse interviews before facing real selection processes.
The value of this approach lies in the combination of repeated practice and immediate feedback. In large educational settings, it is often difficult to provide individualized interview coaching to every student. A calibrated chatbot can complement traditional instruction by offering students a standardized space to practice, receive feedback, and improve their responses over time. This does not replace human mentorship, but it can support it by giving students more opportunities to prepare before interacting with teachers, recruiters, or career advisors.
From a human resources perspective, AI-powered interview simulations may contribute to more consistent preparation conditions. Candidates often differ in their access to professional networks, coaching, and prior interview experience. Chatbot-based training could help reduce part of this gap by providing structured practice and feedback to a broader group of students. However, these tools should be understood as complementary training resources rather than as substitutes for human judgment or professional selection processes.
The results also suggest that interview preparation should not focus exclusively on verbal content. Emotional expressiveness, facial engagement, and vocal modulation are part of how candidates present themselves in evaluative contexts. The observed increases in positive facial expression and vocal happiness, together with reductions in excessive neutrality and some negative indicators, suggest that chatbot-based practice may help students become more aware of how they communicate beyond the literal content of their answers.
For educational institutions, these findings support the integration of AI-assisted simulations into employability programs, provided that they are carefully designed, ethically implemented, and aligned with pedagogical objectives. The calibration of the chatbot is particularly important. In this study, the use of expert-validated questions, a standardized prompt, and consistent feedback criteria helped ensure that the training process was comparable across participants.
For human resources practice, the study points to the potential of AI tools for pre-interview preparation and candidate development. Rather than using AI only as a mechanism for screening or evaluation, organizations and universities can also use it formatively, helping candidates improve their communication before entering formal selection processes. This is especially valuable when the goal is not only to identify talent, but also to prepare students and early-career professionals to participate more confidently and effectively in recruitment contexts.
Overall, the findings suggest that AI-powered chatbot training may be a useful complementary tool for strengthening interview readiness. Its main contribution lies in offering repeated, structured, and feedback-based practice that can be combined with behavioral measurement technologies. Nevertheless, future research should examine whether the changes observed in simulated interviews persist over time and whether they are reflected in evaluations made by human recruiters or external observers.

7. Limitations and Future Studies

This study has several limitations that should be considered when interpreting the findings. First, the research used a one-group pretest-posttest quasi-experimental design without a control group. For this reason, the results cannot be interpreted as definitive evidence that the chatbot training alone caused the observed changes. Other factors, such as repeated exposure to the interview format, greater familiarity with the questions, reduced anxiety after practice, or increased comfort with the laboratory setting, may also have contributed to the pre-post differences.
Second, the sample consisted of 54 students from a single private Latin American university. Although this sample size is adequate for within-participant comparisons in an applied behavioral study, the single-site sample limits the generalizability of the findings. Future studies should include larger and more diverse samples, ideally from different universities, academic programs, and cultural contexts. This would make it possible to assess whether the observed patterns are consistent across broader student populations.
Third, the study was conducted in a simulated interview setting. This allowed for greater control over the procedure and measurement conditions, but it may not fully reproduce the pressure, unpredictability, and interpersonal complexity of real job interviews. Future research should examine whether chatbot-based training produces similar effects in more realistic selection contexts, including live interviews with recruiters or external evaluators.
Fourth, the study focused on facial and vocal emotional reaction time as the main outcomes. These indicators are relevant for understanding emotional expressiveness and vocal modulation, but they do not capture the full complexity of interview performance. Future studies could include additional measures, such as recruiter ratings, independent evaluator assessments, quality of verbal responses, body language, self-efficacy, interview anxiety, and actual employment outcomes.
Fifth, although eye-tracking was used to contextualize facial expression data, it was not treated as a primary statistical outcome in this analysis. Future research could examine eye-tracking indicators more directly, including fixation duration, gaze stability, visual attention to the interviewer or screen, and patterns of visual engagement during difficult questions. This would provide a more complete understanding of how visual attention relates to emotional regulation during interview simulations.
Sixth, some statistical results were stronger in the Wilcoxon signed-rank test than in the paired t-test. This suggests that certain changes may reflect consistent participant-level shifts rather than large average differences. Future studies should report additional statistical indicators, such as effect sizes, confidence intervals, and corrected p-values for multiple comparisons. These analyses would help clarify the magnitude and robustness of the observed changes.
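As a rough sketch of what such additional indicators could look like, the example below applies a Holm step-down correction to the four Wilcoxon p-values reported in Table 7 and computes the effect size r = |Z| / sqrt(N), a common convention for the signed-rank test. Several choices here are assumptions rather than the authors' analysis: the four vocal tests are treated as one family of comparisons, N = 54 assumes every participant contributed a non-zero difference, and the `holm` helper is written for this illustration.

```python
import math

# Wilcoxon signed-rank Z and p-values as reported in Table 7 (vocal indicators).
table7 = {
    "happiness": (-3.21, 0.001),
    "sadness": (1.97, 0.049),
    "anger": (2.46, 0.014),
    "neutrality": (1.97, 0.049),
}
N = 54  # sample size reported in the study (assumed number of usable pairs)

def holm(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted

names = list(table7)
adj = holm([table7[name][1] for name in names])
effect = {name: abs(table7[name][0]) / math.sqrt(N) for name in names}

for name, p_adj in zip(names, adj):
    z, p = table7[name]
    print(f"{name:<10} p={p:.3f} holm={p_adj:.3f} r={effect[name]:.2f}")
```

Under these assumptions, the adjusted p-values are 0.004 for happiness, 0.042 for anger, and 0.098 for both sadness and neutrality, with effect sizes of roughly 0.44, 0.33, and 0.27. This is consistent with the reading that vocal happiness is the most robust vocal result, although a full analysis would need the raw data and a justified definition of the comparison family.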
Finally, the chatbot was calibrated through expert review and standardized prompting, but future work could further validate the chatbot’s scoring system against external human evaluators. Comparing chatbot scores with ratings from experienced recruiters would help determine whether the feedback provided by the chatbot aligns with professional judgment in recruitment and selection contexts.
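One simple way to approach such a comparison is a rank correlation between chatbot scores and recruiter ratings for the same responses. The sketch below computes Spearman's rho with average ranks for ties. All scores are hypothetical, the recruiter ratings are assumed to use the same 1–10 scale as the chatbot, and the helper names are illustrative. Agreement indices such as the intraclass correlation could be used instead.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical scores for eight responses (1-10 scale).
chatbot_scores = [6, 7, 8, 5, 9, 7, 4, 8]
recruiter_ratings = [5, 7, 9, 4, 8, 6, 5, 7]

rho = spearman(chatbot_scores, recruiter_ratings)
print(f"Spearman rho = {rho:.2f}")
```

A high positive rho would indicate that the chatbot orders responses similarly to human raters, while a low value would suggest that its scoring criteria need recalibration. Rank correlation captures ordering only, so systematic leniency or severity in the chatbot's scores would need to be checked separately.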
Future research should therefore move in three directions. First, studies should include control or comparison groups to better isolate the effect of chatbot-based training. Second, longitudinal designs should be used to determine whether changes in emotional expressiveness and vocal modulation persist over time. Third, future studies should examine whether improvements observed in simulated interviews translate into better performance in real recruitment processes.
In summary, the present study provides useful preliminary evidence that AI-powered chatbot training may support students’ emotional and vocal preparation for job interviews. The findings are particularly relevant for higher education institutions seeking scalable tools to strengthen employability skills. Nevertheless, further research is needed to confirm these results, refine the intervention, and evaluate its impact in more diverse and realistic professional contexts.

Author Contributions

Alberto Grajeda and Juan Pablo Cordova jointly conceived and designed the study. Juan Pablo Cordova, Pamela Cordova, Patricia Gasser, and Isabel La Fuente oversaw data collection, database creation, and data processing and analysis. Patricia Gasser, Isabel La Fuente, Alberto Grajeda, and María Isabel Pueyo contributed to data interpretation. Pamela Cordova and Alberto Grajeda wrote the initial manuscript draft, which Hernan Naranjo and María Isabel Pueyo reviewed and refined. All authors reviewed the final manuscript and approved it for submission.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki. Ethical approval/exemption for the research protocol underlying this study was granted by the Ethics Committee in Research at Universidad Privada Boliviana (UPB) on 25 January 2023. The Committee determined that the protocol was exempt from formal ethical evaluation because it involved a minimal-risk and non-invasive educational training activity, voluntary participation, informed consent, and de-identified data analysis. No formal ethical approval/reference number or permit number was assigned in the exemption letter. No separate independent review board approval was obtained, as the institutional ethics review was conducted by the Ethics Committee in Research at Universidad Privada Boliviana (UPB).

Data Availability Statement

The datasets generated and analyzed during the current study are not publicly available due to privacy restrictions but are available from the corresponding author on reasonable request. Any data shared will be de-identified.

Acknowledgments

During the preparation of this manuscript, we utilized ChatGPT-4o (OpenAI) to refine the English syntax and semantics. The tool was used exclusively to enhance the clarity, coherence, and linguistic quality of the manuscript, given that the authors are non-native English speakers. All intellectual content, including the ideas and conclusions presented, remains entirely the responsibility of the authors, who thoroughly reviewed and validated the final text to ensure its accuracy and integrity.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Albassam, W. A. The power of artificial intelligence in recruitment: An analytical review of current AI-Based recruitment strategies. Int. J. Prof. Bus. Rev. 2023, 8(6), 1–25.
  2. Balconi, M.; Cassioli, F. “We will be in touch”. A neuroscientific assessment of remote vs. face-to-face job interviews via EEG hyperscanning. Soc. Neurosci. 2022, 17(3), 209–224.
  3. Batrinca, L.; Stratou, G.; Shapiro, A.; Morency, L. P.; Scherer, S. Cicero: Towards a multimodal virtual audience platform for public speaking training. In Intelligent virtual agents. IVA 2013; Aylett, R., Krenn, B., Pelachaud, C., Shimodaira, H., Eds.; Lecture notes in computer science (Vol. 8108); Springer, 2013.
  4. Bear, H. M.; Purves, D.; Schwartz, M. Eye Tracking and the Neuromarketing Field: Understanding Human Emotions and Engagement; Academic Press, 2020.
  5. Boudjani, N.; Colas, V.; Joubert, C.; Amor, D. B. AI chatbot for job interview. In 2023 46th MIPRO ICT and Electronics Convention (MIPRO); 2023; pp. 1155–1160.
  6. Brunet, L.; Müller, R. The Feeling Rules of Peer Review: Defining, Displaying, and Managing Emotions in Evaluation for Research Funding. Minerva 2024, 62, 167–192.
  7. Callejas, Z.; Ravenet, B.; Ochs, M.; Pelachaud, C. A model to generate adaptive multimodal job interviews with a virtual recruiter. In Proceedings of the European Conference on Computer Vision. University of Granada, CITIC-UGR, Granada, Spain. 2014.
  8. Chamorro-Premuzic, T. The Talent Delusion: Why Data, Not Intuition, Is the Key to Unlocking Human Potential; Piatkus Books, 2017.
  9. Cohen-Chen, S.; Brady, G. L.; Massaro, S.; van Kleef, G. A. Meh, whatever: The effects of indifference expressions on cooperation in social conflict. J. Personal. Soc. Psychol. 2022, 123(6), 1336–1361.
  10. Dixit, S.; Sharma, N.; Maurya, M.; Dharwal, M. AI Power: Making Recruitment Smarter. In Evolution of Digitized Societies Through Advanced Technologies; Springer, 2022; pp. 165–180.
  11. Ekman, P. Emotions revealed. BMJ 2004, 328 (Suppl S5), 0405184.
  12. Goleman, D. Emotional Intelligence: Why It Can Matter More Than IQ; Bantam Books, 1995.
  13. Grandey, A. A. When “The Show Must Go On”: Surface Acting and Deep Acting as Determinants of Emotional Exhaustion and Peer-Rated Service Delivery. Acad. Manag. J. 2003, 46(1), 86–96.
  14. Gross, J. J. Emotion regulation: Affective, cognitive, and social consequences. Psychophysiology 2002, 39, 281–291.
  15. Harrison, J. A.; Halinski, M.; Manroop, L. Happy, and they know it? The roles of positive affectivity, intrinsic motivation and network building on LinkedIn on employment predictions. Career Dev. Int. 2024, 29(6), 656–673.
  16. Hemamou, L.; Felhi, G.; Vandenbussche, V.; Martin, J.-C.; Clavel, C. HireNet: A Hierarchical Attention Model for the Automatic Analysis of Asynchronous Video Job Interviews. Proc. AAAI Conf. Artif. Intell. 2019, 33(01), 573–581.
  17. Hogan, R.; Hogan, J.; Kaiser, R. B. Management derailment: Personality assessment and mitigation. Personal. Ment. Health 2010, 4(1), 26–40.
  18. Howe, J. R. Fear of negative and positive evaluation across social evaluative situations. Master’s thesis, Eastern Illinois University, 2014. Available online: https://thekeep.eiu.edu/theses/1265.
  19. Huang, M. H.; Rust, R. T. Artificial intelligence in service. J. Serv. Res. 2018, 21(2), 155–172.
  20. Jacobs, E.; Broekens, J.; Jonker, C. Joy, Distress, Hope, and Fear in Reinforcement Learning. In Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Paris, France. 2014. Available online: https://www.proceedings.com/content/023/023344webtoc.pdf.
  21. Kolb, D. A. Experiential Learning: Experience as the Source of Learning and Development; Prentice Hall, 2014.
  22. Liu, G.-Y.; Hu, J.-M.; Wang, H.-L. A co-word analysis of digital library field in China. Scientometrics 2012, 91(1), 203–217.
  23. Lievens, F.; De Paepe, A. An empirical investigation of interviewer-related factors that discourage the use of high structure interviews. J. Organ. Behav. 2004, 25, 29–46.
  24. Mendoza, H. E. Nuevos desafíos en la contratación de personal: cómo la evolución del proceso de reclutamiento está transformando el mercado laboral. J. Econ. Soc. Sci. Res. 2021, 1(3), 54–67.
  25. Mocha, V. La importancia de la entrevista como herramienta en el proceso de selección del talento humano. Rev. Dilemas Contemp. Educ. Política Y Valores 2018, 6(44).
  26. Nawaz, N.; Gomes, A. M. Artificial Intelligence Chatbots are New Recruiters. Int. J. Adv. Comput. Sci. Appl. 2019, 10(9), 1–8.
  27. Nuzula, I. F.; Amri, M. M. Will ChatGPT bring a New Paradigm to HR World? A Critical Opinion Article. J. Manag. Stud. Dev. 2023, 2(2), 142–161.
  28. Ottilie, T.; Krings, F.; Roulin, N.; Bourdage, J. S.; Fetzer, M. Reactions to asynchronous video interviews: The role of design decisions and applicant age and gender. In Human Resource Management; 2023.
  29. Picard, R. W. Affective Computing; MIT Press, 2000.
  30. Piedra Mayorga, V. M.; Granillo Macias, R.; Vázquez Alamilla, M. A.; Rodriguez Moreno, R. Procedimiento para el reclutamiento, selección e inducción del personal: perspectivas y tendencias. Ingenio Y Conciencia Boletín Científico De La Escuela Superior Ciudad Sahagún 2023, 10(19), 61–69.
  31. Roulin, N.; Bourdage, J. S.; Wingate, T. G. Who Is Conducting ‘Better’ Employment Interviews? Antecedents of Structured Interview Components Use. Pers. Assess. Decis. 2019, 5(1), 37–48.
  32. Sardi, B.; Troilo, F. Entrevistas de selección de personal mediadas por tecnología: La perspectiva de selectores (Documento de Trabajo N.° 737); Universidad del CEMA, 2020. Available online: https://www.econstor.eu/handle/10419/238362.
  33. Shen, D. How do emotions like trust and fear shape East-Asian security dynamics. J. Educ. Humanit. Soc. Sci. 2023, 18, 183–185.
  34. Singh, A. P.; Saxena, R.; Saxena, S. The human touch in the age of Artificial Intelligence: A Literature review on the interplay of emotional intelligence and AI. Asian J. Curr. Res. 2024, 9(4), Article 4.
  35. Stephens, K. K.; Nader, K.; Hughes, A. L.; Harris, A. G.; Montagnolo, C.; Stevens, A.; Senarath Wijesuriya, Y. P.; Purohit, H. Online-computer-mediated interviews and observations: Overcoming challenges and establishing best practices in a human-AI teaming context. In Proceedings of the 54th Hawaii International Conference on System Sciences; 2021; pp. 2896–2905.
  36. Upadhyay, A. K.; Khandelwal, K. Applying artificial intelligence: implications for recruitment. Strateg. HR Rev. 2018, 17(5), 255–258.
  37. Van Doorn, E. A.; Van Kleef, G. A.; Van der Pligt, J. How emotional expressions shape prosocial behavior: Interpersonal effects of anger and disappointment on compliance with requests. Motiv. Emot. 2015, 39, 128–141.
  38. Xiao, Z.; Zhou, M. X.; Chen, W.; Yang, H.; Chi, T. If I Hear You Correctly: Building and Evaluating Interview Chatbots with Active Listening Skills. In CHI 2020 Conference on Human Factors in Computing Systems, Honolulu, HI, USA. ACM. 2020.
  39. Xu, W. Research on the Correlation Between Fear of Negative Evaluation and Perfectionism Among College Students. In Proceedings of the International Conference on Global Politics and Socio-Humanities; 2023.
  40. Zhang, S.; Chen, L.; Zhang, L.; Stein, A. M. The ripple effect: How leader workplace anxiety shape follower job performance. Front. Psychol. 2022, 13, 965365.
  41. Zhou, M. X.; Chen, W.; Xiao, Z.; Yang, H.; Chi, T.; Williams, R. Getting Virtually Personal: Chatbots Who Actively Listen to You and Infer Your Personality. In 24th International Conference on Intelligent User Interfaces (IUI ’19 Companion), Marina Del Rey, CA, USA. ACM. 2019.

Table 1. Chatbot calibration process.

| Calibration stage | Description | Purpose |
|---|---|---|
| Expert review of interview questions | Ten experts reviewed the five core interview questions | Ensure professional relevance, clarity, and realism |
| Validation of evaluation criteria | Experts reviewed the scoring criteria: clarity, coherence, conciseness, vocabulary, emotional tone, and adequacy | Align chatbot feedback with HR interview standards |
| Trial interactions | Pilot interactions were conducted to review chatbot responses and feedback quality | Identify inconsistencies in scoring or recommendations |
| Prompt adjustment | The base prompt was refined to standardize tone, structure, and evaluation logic | Ensure consistency across participants |
| Final calibration | The final prompt and question structure were fixed before the intervention | Preserve comparability in the training process |

Table 2. Intervention sequence.

| Phase | Activity | Measurement | Purpose |
|---|---|---|---|
| Pre-training assessment | Baseline simulated interview | Facial expression recognition, eye-tracking, and vocal tone analysis | Establish initial facial and vocal emotional indicators |
| Training session 1 | Chatbot-based interview practice with feedback | Chatbot score and qualitative feedback | Introduce structured feedback and identify areas for improvement |
| Training session 2 | Repeated chatbot-based practice | Chatbot score and qualitative feedback | Apply previous recommendations and refine answers |
| Training session 3 | Repeated chatbot-based practice | Chatbot score and qualitative feedback | Reinforce learning and consolidate response strategies |
| Post-training assessment | Final simulated interview | Facial expression recognition, eye-tracking, and vocal tone analysis | Compare post-training facial and vocal emotional indicators with baseline |

Table 3. Laboratory technologies and analytical role.

| Technology | Data collected | Role in the study | Main use in the analysis |
|---|---|---|---|
| iMotions-supported facial expression recognition | Facial emotional reaction time | Detection of facial emotional patterns during simulated interviews | Primary outcome for facial emotional indicators |
| Eye-tracking | Visual attention and engagement indicators | Verification and contextualization of participants’ visual engagement during the task | Complementary support for interpreting facial expression data |
| Vocal tone analysis | Emotional vocal reaction time | Detection of emotional tone in participants’ speech | Primary outcome for vocal emotional indicators |
| AI-powered chatbot | Interview responses, scores, and qualitative feedback | Training tool for structured interview practice | Intervention mechanism, not the main outcome variable |

Table 4. Variables and measures.

| Dimension | Indicator type | Variables | Measurement unit | Interpretation |
|---|---|---|---|---|
| Facial expression | Aggregate indicators | Positive, negative, neutral, confusion, sentimentality | Percentage of interview time | Proportion of time in which each facial category was detected |
| Facial expression | Disaggregated emotions | Joy, surprise, anger, sadness, disgust, fear, contempt | Percentage of interview time | Proportion of time associated with specific facial emotions |
| Vocal expression | Vocal emotional indicators | Happiness, sadness, anger, neutrality | Percentage of speaking/interview time | Proportion of time associated with specific vocal emotional tone |
| Chatbot interaction | Formative training indicators | Individual response scores and overall session score | 1–10 scale | Used for feedback during training, not as the main outcome variable |
| Eye-tracking | Contextual indicators | Visual attention and engagement | Gaze/fixation-related indicators | Used to contextualize facial expression analysis |

Table 5. Statistical analysis plan.

| Analysis stage | Procedure | Purpose |
|---|---|---|
| Descriptive analysis | Pre-training and post-training means | Describe direction and magnitude of change |
| Parametric paired comparison | Paired t-test | Assess mean pre-post differences |
| Non-parametric paired comparison | Wilcoxon signed-rank test | Assess participant-level pre-post changes when normality may not hold |
| Interpretation of convergence | Comparison of t-test and Wilcoxon results | Distinguish stronger evidence from more cautious findings |
| Reporting | Test statistics, p-values, and significance indicators | Present results transparently and consistently |

Table 6. Statistical significance tests of emotional facial reaction time in simulated job interviews comparing pre- and post-AI-powered chatbot training.

| Facial expressions | Pre-training mean | Post-training mean | t-test | t-test p-value | Wilcoxon signed-rank Z | Wilcoxon p-value | Diff. |
|---|---|---|---|---|---|---|---|
| Aggregate | | | | | | | |
| Proportion of Positive Emotional Facial Reaction Time (%) | 3.52 | 14.75 | -3.16*** | <0.001 | -3.05*** | 0.002 | Yes |
| Proportion of Negative Emotional Facial Reaction Time (%) | 1.82 | 0.70 | 1.49* | 0.070 | 2.58*** | 0.010 | Yes |
| Proportion of Confusion Emotional Facial Reaction Time (%) | 1.07 | 0.70 | 0.54 | 0.600 | 2.42** | 0.016 | Yes |
| Proportion of Sentimentality Emotional Facial Reaction Time (%) | 1.09 | 2.18 | -1.64** | 0.050 | -2.01** | 0.044 | Yes |
| Proportion of Neutral Emotional Facial Reaction Time (%) | 83.17 | 70.18 | 1.70** | 0.050 | 2.08** | 0.038 | Yes |
| Disaggregate | | | | | | | |
| Proportion of Joy Emotional Facial Reaction Time (%) | 2.38 | 10.10 | -2.95*** | <0.001 | -2.93*** | 0.003 | Yes |
| Proportion of Surprise Emotional Facial Reaction Time (%) | 7.48 | 7.45 | 0.01 | 0.500 | 0.12 | 0.904 | No |
| Proportion of Anger Emotional Facial Reaction Time (%) | 0.66 | 0.84 | -0.47 | 0.640 | -0.62 | 0.535 | No |
| Proportion of Sadness Emotional Facial Reaction Time (%) | 0.48 | 0.30 | 0.71 | 0.470 | 2.67*** | 0.008 | Yes |
| Proportion of Disgust Emotional Facial Reaction Time (%) | 0.45 | 0.06 | 1.31* | 0.100 | 2.19** | 0.029 | Yes |
| Proportion of Fear Emotional Facial Reaction Time (%) | 6.61 | 6.11 | 0.16 | 0.870 | 1.37 | 0.171 | No |
| Proportion of Contempt Emotional Facial Reaction Time (%) | 3.34 | 0.62 | 1.76** | 0.040 | 2.11** | 0.035 | Yes |

Note: *** p<0.01 (significant at 99% confidence); ** p<0.05 (significant at 95% confidence); * p<0.1 (significant at 90% confidence).

Table 7. Statistical significance tests of emotional vocal reaction time in simulated job interviews comparing pre- and post-AI-powered chatbot training.

| Vocal expressions | Pre-training mean | Post-training mean | t-test | t-test p-value | Wilcoxon signed-rank Z | Wilcoxon p-value | Diff. |
|---|---|---|---|---|---|---|---|
| Proportion of Happiness Emotional Vocal Reaction Time (%) | 2.79 | 10.71 | -3.49*** | 0.00 | -3.21*** | 0.001 | Yes |
| Proportion of Sadness Emotional Vocal Reaction Time (%) | 13.84 | 12.71 | -0.24 | 0.40 | 1.97** | 0.049 | Yes |
| Proportion of Anger Emotional Vocal Reaction Time (%) | 1.64 | 1.41 | 0.54 | 0.60 | 2.46** | 0.014 | Yes |
| Proportion of Neutrality Emotional Vocal Reaction Time (%) | 66.67 | 62.72 | -1.67* | 0.10 | 1.97** | 0.049 | Yes |

Note: *** p<0.01 (significant at 99% confidence); ** p<0.05 (significant at 95% confidence); * p<0.1 (significant at 90% confidence).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
