Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints202403.1140.v1

CheapSE: Improving Magnitude-Based Speech Enhancement Using Self-Reference

Benzhe Dai, Kaijun Tan, Huidong Xue, Huaxiang Lu

Subject: Computer Science And Mathematics, Security Systems Keywords: GRU; Self-Reference; speech enhancement

Online: 19 March 2024 (11:20:26 CET)

Preprint ARTICLE | doi:10.20944/preprints201910.0376.v1

Evaluation of Mixed Deep Neural Networks for Reverberant Speech Enhancement

Michelle Gutiérrez-Muñoz, Astryd González-Salazar, Marvin Coto-Jiménez

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial neural network; deep learning; LSTM; speech processing

Online: 31 October 2019 (16:40:30 CET)

Preprint ARTICLE | doi:10.20944/preprints202303.0158.v1

Characterization of Deep-Learning-Based Speech Enhancement Techniques in Online Audio Processing Applications

Caleb Rascon

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speech enhancement; online applicability; real-time factor

Online: 8 March 2023 (15:25:56 CET)

Preprint ARTICLE | doi:10.20944/preprints202302.0465.v1

CGA-MGAN: Metric GAN based on Convolution-augmented Gated Attention for Speech Enhancement

Haozhe Chen, Xiaojuan Zhang

Subject: Computer Science And Mathematics, Information Systems Keywords: CGA-MGAN; Gated Attention Unit; Speech Enhancement

Online: 27 February 2023 (09:24:31 CET)

Preprint ARTICLE | doi:10.20944/preprints202103.0221.v1

Robustness and Sensitivity Tuning of the Kalman Filter for Speech Enhancement

Sujan Kumar Roy, Kuldip K. Paliwal

Subject: Engineering, Electrical And Electronic Engineering Keywords: Speech enhancement; Kalman filter; Kalman gain; robustness metric; sensitivity metric; LPC, whitening filter; real-life noise

Online: 8 March 2021 (13:39:44 CET)

Preprint ARTICLE | doi:10.20944/preprints202004.0001.v1

Power Spectral Analysis upon Disfluent Utterance in Adults Who Stutter; a qEEG-based Investigation

Masoumeh Bayat, Reza Boostani, Malihe Sabeti, Fariba Yadegari, Mohammadreza Pirmoradi, Rao KS, Mohammad Nami

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: stuttering; power spectra; speech preparation; imagined speech; simulated speech

Online: 1 April 2020 (07:52:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints202112.0196.v1

Assessment of Speech Quality during Speech Rehabilitation Based on the Solution of the Classification Problem

Evgeny Kostyuchenko, Ivan Rakhmanenko, Alexander Shelupanov, Lidiya Balatskaya, Ivan Sidorov

Subject: Computer Science And Mathematics, Information Systems Keywords: Speech Rehabilitation; Speech Quality Assessment; LSTM

Online: 13 December 2021 (10:10:36 CET)

Preprint ARTICLE | doi:10.20944/preprints202005.0383.v1

Order Effects in the Perception and Production of New Words

Peter Richtsmeier, Michelle Moore

Subject: Social Sciences, Psychology Keywords: child speech; speech production; speech perception; learning; consonant age of acquisition

Online: 24 May 2020 (16:07:44 CEST)

Purpose: Perceptual learning and production practice are basic mechanisms that children depend on to acquire adult levels of speech accuracy. In this study, we examined perceptual learning and production practice as they contributed to changes in speech accuracy in three- and four-year-old children. Our primary focus was manipulating the order of perceptual learning and baseline production practice to better understand when and how these learning mechanisms interact. Method: Sixty-five typically-developing children between the ages of three and four were included in the study. Children were asked to produce CVCCVC nonwords like /bozjəm/ and /tʌvtʃəp/ that were described as the names of make-believe animals. All children completed two separate experimental blocks: a baseline block in which participants heard each nonword once and repeated it, and a test block in which the perceptual input frequency of each nonword varied between 1 and 10. Half of the participants completed a baseline-test order; half completed a test-baseline order. Results: Greater accuracy was observed for nonwords produced in the second experimental block, reflecting a production practice effect. Perceptual learning resulted in greater accuracy during the test for nonwords that participants heard 3 or more times. However, perceptual learning did not carry over to baseline productions in the test-baseline design, suggesting that it reflects a kind of temporary priming. Finally, a post hoc analysis suggested that the size of the production practice effect depended on the age of acquisition of the consonants that comprised the nonwords. Conclusions: The study provides new details about how perceptual learning and production practice interact with each other and with phonological aspects of the nonwords, resulting in complex effects on speech accuracy and learning of form-referent pairs. These findings may ultimately help speech-language pathologists maximize their clients’ improvement in therapy.

Preprint ARTICLE | doi:10.20944/preprints202306.0223.v1

Enhancing Voice Cloning Quality through Data Selection and Alignment-based Metrics

Ander González Docasal, Aitor Álvarez

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Voice Cloning; Speech Synthesis; Speech Quality Evaluation

Online: 5 June 2023 (02:27:49 CEST)

Preprint ARTICLE | doi:10.20944/preprints201807.0106.v1

Auditory-Visual Speech Perception in Bipolar Disorder: A Preliminary Study

Arzu Yordamlı, Doğu Erdener

Subject: Social Sciences, Cognitive Science Keywords: auditory-visual speech perception; bipolar disorder; speech perception

Online: 6 July 2018 (05:21:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202210.0480.v1

Bilingual ASR Model With Language Identification for Brazilian Portuguese and South-American Spanish

Felipe Farias, Wilmer Lobato, William Cruz, Marcellus Amadeus

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Speech Recognition; Automatic Speech Recognition; Language Identification; Wav2Vec2; Multilingual

Online: 31 October 2022 (10:06:34 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.0497.v1

Hate Speech Detection in the Arabic Language: Corpus Design, Construction and Evaluation

Ashraf Ahmad, Mohammad Azzeh, Eman Alnagi, Qasem Abu Al-Haija, Dana Halabi, Abdullah Aref, Yousef AbuHour

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Arabic Hate Speech; Natural Language Processing (NLP); Machine Learning; Arabic Hate Speech Detection; Arabic Hate Speech Corpus

Online: 7 September 2023 (07:14:15 CEST)

Preprint ARTICLE | doi:10.20944/preprints202104.0651.v1

Speech Emotion Recognition using Data Augmentation Method by Cycle-Generative Adversarial Networks

Arash Shilandari, Hossein Marvi, Hossein Khosravi

Subject: Engineering, Electrical And Electronic Engineering Keywords: speech processing, data augmentation, speech emotion recognition, generative adversarial networks

Online: 26 April 2021 (10:49:55 CEST)

With the mechanization of everyday life, speech processing has become crucial for interaction between humans and machines. Deep neural networks require a database with enough data for training. The more features are extracted from the speech signal, the more samples are needed to train these networks. Adequate training can be ensured only when sufficient and varied data are available in each class. If there is not enough data, data augmentation methods can be used to obtain a database with enough samples. One of the obstacles to developing speech emotion recognition systems is the data sparsity problem in each class for neural network training. The current study focuses on building a cycle generative adversarial network for data augmentation in a speech emotion recognition system. For each of the five emotions employed, an adversarial generating network is designed to generate data that is very similar to the main data in that class while remaining distinguishable from the emotions of the other classes. These networks are trained adversarially to produce feature vectors resembling each class in the main feature space, and the generated vectors are then added to the existing training sets in the database to train the classifier network. Instead of the common cross-entropy error, Wasserstein divergence is used to train the generative adversarial networks, both to remove the vanishing gradient problem and to produce high-quality artificial samples. The suggested network has been tested for speech emotion recognition using EMODB for training, testing, and evaluation, and the quality of the artificial data has been evaluated using two classifiers, a Support Vector Machine (SVM) and a Deep Neural Network (DNN). Moreover, the results show that, by extracting and reproducing high-level features from acoustic features, speech emotion recognition separating five primary emotions can be achieved with acceptable accuracy.

Preprint ARTICLE | doi:10.20944/preprints202103.0513.v1

Automatic Voice Query Service for Multi-Accented Mandarin Speech

Kejing Xiao, Zhaopeng Qian

Subject: Engineering, Automotive Engineering Keywords: Automatic Voice Query Service; Automatic Speech Recognition; Multi-Accented Mandarin Speech Recognition

Online: 22 March 2021 (10:55:53 CET)

Preprint ARTICLE | doi:10.20944/preprints202312.1284.v1

Cochlear implant benefits are relevant and stable over time for adult patients with Single Sided Deafness

Domenico Cuda, Erica Pizzol, Andrea Laborai, Daria Salsi, Sara Ghiselli

Subject: Medicine And Pharmacology, Otolaryngology Keywords: single sided deafness; localization; SSQ; THI; speech intelligibility in quiet; speech intelligibility in noise

Online: 18 December 2023 (13:24:46 CET)

Preprint ARTICLE | doi:10.20944/preprints202301.0008.v1

Exploring Prosodic Features Modelling for Secondary Emotions Needed for Empathetic Speech Synthesis

Jesin James, Balamurali B.T, Catherine Watson, Hansjörg Mixdorff

Subject: Engineering, Electrical And Electronic Engineering Keywords: Secondary emotions; emotional speech synthesis; fundamental frequency contour; Fujisaki model; low-resource; empathetic speech

Online: 3 January 2023 (07:29:37 CET)

Preprint ARTICLE | doi:10.20944/preprints202304.0575.v3

Imagined Speech Classification Using EEG and Deep Learning

Mokhles M. Abdulghani, Wilbur L. Walters, Khalid H. Abed

Subject: Engineering, Bioengineering Keywords: Inner Speech; Imagined Speech; EEG Decoding; Brain-Computer Interface; BCI; LSTM; Wavelet Scattering Transformation; WST.

Online: 15 May 2023 (05:43:54 CEST)

Preprint BRIEF REPORT | doi:10.20944/preprints202310.0690.v2

Precision Location Keyword Detection Using Offline Speech Recognition Technique

Mohsin Imam, Gaurav Gupta

Subject: Computer Science And Mathematics, Other Keywords: Keyword Detection; Audio Models; Speech Processing

Online: 7 November 2023 (02:34:57 CET)

This study introduces an original, comprehensive system centered on identifying specific terms that indicate a user's position, particularly the discrete values representing latitude and longitude. The system not only detects these terms but also retrieves the corresponding numerical data for accurate and efficient determination of locations. The study is relevant to various fields, notably aiding offline operations of military personnel, who often lack internet access. In such scenarios, precise awareness of location is vital for strategic manoeuvres, rescue operations, and navigating unfamiliar landscapes. The system helps these personnel by allowing them to extract exact location coordinates from spoken terms, thereby enhancing their awareness even in challenging surroundings. Apart from its military utility, the project holds broader significance. Teams responding to emergencies, personnel involved in disaster management, and exploratory missions can all gain from this technology during disruptions in communication infrastructure. Furthermore, travelers, adventurers, and outdoor enthusiasts can use this system to accurately determine their positions in remote areas without relying on online maps. We used offline speech recognition techniques to precisely transcribe spoken terms, achieving an accuracy of over 91.3% and a word error rate of 4.2%. For sound recognition, the OpenAI Whisper model was used, and a conversion process from SpeechRecognition to AudioSegmentation was implemented, followed by transforming the audio into .wav format. This was done to ensure seamless compatibility with the Whisper model and uninterrupted audio input. We also developed the app's interface using Streamlit so that it can be used efficiently. By training the system to identify specific linguistic cues linked to location, it achieves robust detection and extraction of relevant terms. This approach eliminates the need for constant internet connectivity, making it exceptionally useful in remote, offline, and resource-limited situations.
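
As an illustration of the keyword-and-value step this abstract describes, the following is a minimal Python sketch, not the authors' implementation. It assumes the speech has already been transcribed to text and that coordinates are spoken as "latitude" or "longitude" followed by a decimal number; the function name `extract_coordinates` and the regular expression are hypothetical choices made for this example.

```python
import re

# Hypothetical pattern: the keyword "latitude" or "longitude", a short run of
# non-numeric filler that does not contain the other keyword, then a number.
_COORD = re.compile(
    r"\b(latitude|longitude)\b"
    r"(?:(?!latitude|longitude)[^0-9+\-])*?"
    r"([+-]?\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_coordinates(transcript: str) -> dict[str, float]:
    """Return the first latitude and longitude values found in a transcript."""
    found: dict[str, float] = {}
    for keyword, number in _COORD.findall(transcript):
        # Keep only the first value seen for each keyword.
        found.setdefault(keyword.lower(), float(number))
    return found


if __name__ == "__main__":
    text = "Our latitude is 28.6139 and longitude is 77.2090, over."
    print(extract_coordinates(text))
```

A real transcript would likely need extra handling that this sketch omits, for example numbers spelled out as words or signs spoken as "minus" or "south".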

Preprint ARTICLE | doi:10.20944/preprints202310.0722.v1

A Neural Network Architecture for Children’s Audio-Visual Emotion Recognition

Anton Matveev, Yuri Matveev, Olga Frolova, Aleksandr Nikolaev, Elena Lyakso

Subject: Computer Science And Mathematics, Computer Science Keywords: Audio-visual speech; emotion recognition; children

Online: 13 October 2023 (07:11:19 CEST)

Preprint CONCEPT PAPER | doi:10.20944/preprints202108.0194.v1

Communicative Congruence and Communicative Dysphoria: A Theory of Communication, Personality, and Identity

Brett Welch, Leah Helou

Subject: Social Sciences, Sociology Keywords: congruence; voice; speech; communication; identity; personality

Online: 9 August 2021 (12:41:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202011.0646.v1

Online Multilingual Hate Speech Detection: Experimenting with Hindi and English Social Media

Neeraj Vashistha, Arkaitz Zubiaga

Subject: Computer Science And Mathematics, Computer Science Keywords: social media; hate speech; text classification

Online: 25 November 2020 (14:12:07 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0611.v1

A method for automatically detecting errors in an embedded English speech teaching system

Soumya Majumdar, Karan Das, Omar Abdullah Chowdhury, Ashutosh Prasad Yadav, Bapun Sahoo

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Teaching System, Error Detection, Automated detectors, Online Learning, Spoken English, Speech Teaching System, Artificial Intelligence, Speech Recognition

Online: 9 April 2024 (09:40:31 CEST)

Preprint ARTICLE | doi:10.20944/preprints202301.0580.v1

Spyware Integrated with Prediction Models for Monitoring Corporate Computers

Darlan Noetzold, Anubis Graciela De Moraes Rossetto, Valderi Reis Quietinho Leithardt

Subject: Computer Science And Mathematics, Information Systems Keywords: Electronic monitoring; hate speech; data leakage; prediction

Online: 31 January 2023 (08:59:39 CET)

Preprint ARTICLE | doi:10.20944/preprints202211.0017.v1

Performance Comparison of TTS Models for Brazilian Portuguese to Establish a Baseline

Wilmer Lobato, Felipe Farias, William Cruz, Marcellus Amadeus

Subject: Computer Science And Mathematics, Computer Science Keywords: text-to-speech; naturalness; intelligibility; Brazilian Portuguese

Online: 1 November 2022 (04:37:04 CET)

Preprint ARTICLE | doi:10.20944/preprints202102.0156.v1

An Critical Analysis of Speech Recognition of Tamil and Malay Language Through Artificial Neural Network

Kingston Pal Thamburaj, Kartheges Ponniah, Ilangkumaran Sivanathan, Muniisvaran Kumar

Subject: Social Sciences, Anthropology Keywords: ANN; NN; Speech Recognition; interaction; hybrid method

Online: 5 February 2021 (10:58:40 CET)

Preprint ARTICLE | doi:10.20944/preprints201712.0058.v1

Interactive Hesitation Synthesis and Its Evaluation

Simon Betz, Birte Carlmeyer, Petra Wagner, Britta Wrede

Subject: Social Sciences, Language And Linguistics Keywords: speech synthesis; evaluation; hesitation; virtual agents; interaction

Online: 11 December 2017 (07:03:14 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0239.v1

Inner Speech Recognition for Mutism and Speech Disorder Using Brain-Computer Interface

Mokhles M. Abdulghani, Wilbur L. Walters, Khalid H. Abed

Subject: Public Health And Healthcare, Physical Therapy, Sports Therapy And Rehabilitation Keywords: Inner Speech; Brain-Computer Interface; Imagined Speech; Support Vector Machine; SVM; Autoregressive Model; AR; Wavelet Variance; Shannon Entropy

Online: 3 April 2024 (11:15:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.1851.v1

Analyzing the Influence of Diverse Background Noises on Voice Transmission: A Deep Learning Approach to Noise Suppression

Alberto Nogales, Javier Caracuel Cayuela, Álvaro J. García-Tejedor

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Speech enhancement; Noise suppression; Deep learning; Variational autoencoders

Online: 29 November 2023 (06:25:59 CET)

Preprint ARTICLE | doi:10.20944/preprints202310.1967.v1

Influence of Shortened Tongue Frenulum on Tongue Mobility, Speech and Occlusion

Aldona Dydyk, Marta Milona, Joanna Janiszewska-Olszowska, Marzena Wyganowska, Katarzyna Grocholewicz

Subject: Medicine And Pharmacology, Dentistry And Oral Surgery Keywords: tongue frenulum; ankyloglossia; swallowing; tongue mobility; speech; occlusion

Online: 31 October 2023 (07:59:09 CET)

Preprint ARTICLE | doi:10.20944/preprints202306.1186.v1

Sustainable Health Education Simulator Using Open-Source Technology

Patricia Oyarzún-Diaz, Ana Orellana, Hugo Segura, Cristian Vidal-Silva, Aurora Sánchez, Jorge Serrano

Subject: Engineering, Bioengineering Keywords: SAEF; audiology competencies; audiometry simulation; speech language; students.

Online: 16 June 2023 (07:39:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints202211.0047.v1

Overcoming the Language Barrier in Hearing-Speech Rehabilitation Using Multilingual Conversational Applications

Wiebke Rötz, Theda Eichler, Rayoung Kim, Holger Sudhoff, Ingo Todt

Subject: Medicine And Pharmacology, Otolaryngology Keywords: hearing therapy; speech therapy; cochlea implant; digital application

Online: 2 November 2022 (06:10:30 CET)

Working Paper ARTICLE

Multimodal Hate Speech Detection in Greek Social Media

Konstantinos Perifanos, Dionysis Goutsos

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Multimodal Machine Learning; Deep Learning; Hate Speech Detection

Online: 15 March 2021 (13:46:27 CET)

Preprint ARTICLE | doi:10.20944/preprints202010.0342.v1

Youth Exposure to Hate in the Online Space: An Exploratory Analysis

Nigel Harriman, Neil Shortland, Max Su, Tyler Cote, Marcia A. Testa, Elena Savoia

Subject: Social Sciences, Safety Research Keywords: online hate; hate speech; online disinhibition; online safety

Online: 16 October 2020 (08:27:29 CEST)

Preprint ARTICLE | doi:10.20944/preprints201911.0346.v1

Speech Intelligibility During Clinical and Low Frequency Subthalamic Nucleus Stimulation in Parkinson’s Disease

John J. Sidtis, Diana Van Lancker Sidtis, Ritesh Ramdhani, Michele Tagliati

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: speech; Parkinson’s disease; deep brain stimulation; voice; articulation

Online: 28 November 2019 (02:57:03 CET)

Preprint ARTICLE | doi:10.20944/preprints201910.0231.v1

Solar Powered Automatic Pattern Design Grass Cutting Robot System Using Arduino

Dost Muhammad Khan, Zain Mumtaz, Majid Saleem, Zeeshan Ilyas, Qinglu Ma, Sahrish Ghaffar, Saleem ullah

Subject: Computer Science And Mathematics, Robotics Keywords: Android; arduino; bluetooth; grass cutter; sensors; speech recognition

Online: 20 October 2019 (02:03:44 CEST)

We present an Arduino-based automatic robotic system for cutting grass or lawns, mainly healthy grass that needs to be cut neatly, as in a public park or a private garden. The purpose of the proposed project is to design a programmable, solar-powered, automatic pattern-design grass-cutting robot that removes the need for time-consuming manual grass cutting, that can be operated wirelessly from a safe distance using an Android smartphone via Bluetooth, and that is capable of cutting the grass in the required shapes and patterns; the cutting blade can also be adjusted to maintain different grass lengths. The main focus was to design a prototype that can work with little or no physical user interaction. The proposed work is accomplished using an Arduino microcontroller, DC geared motors, an IR obstacle detection sensor, a motor shield, a relay module, a DC battery, a solar panel, and a Bluetooth module. The grass-cutting robot can be moved remotely to the location in the lawn where the user wants to cut the grass, either directly or in desired patterns. The user can press the desired pattern button in the mobile application, and the system will start cutting grass in the corresponding design, such as a circle, spiral, rectangle, or continuous pattern. In addition, with the assistance of sensors positioned at the front of the vehicle, an automatic barrier detection system is introduced to enhance safety and prevent risks. IR obstacle detector sensors are used to detect obstacles: if an obstacle is found in front of the robot while it is traveling, the robot avoids the barrier by taking a left or right turn or by stopping automatically, as appropriate, thereby preventing a collision. A further aim of this project is to create a grass cutter that relieves users from mowing their own grass and reduces environmental and noise pollution. The proposed system is designed as a lab-scale prototype to experimentally validate the efficiency, accuracy, and affordability of the system. The experimental results show that the proposed work has all-in-one capability (simple and pattern-based grass cutting with a mobile application, plus obstacle detection), is very easy to use, and can be easily assembled in a simple hardware circuit. We note that the proposed system could be implemented on a large scale under real conditions in the future, which would be useful in robotics applications and for cutting grass on playing grounds for sports such as cricket, football, and hockey.

Preprint ARTICLE | doi:10.20944/preprints202305.1060.v1

Speech Acts and Language Function’s Representation in EFL High School English Textbooks Leading Towards Educational Sustainability of Northern Iraq

Aram Sabr Tahr, Behbood Mohammadzadesh, Uma Shankar Singh

Subject: Social Sciences, Education Keywords: EFL; language functions; speech acts; teacher’s perception; textbook evaluation

Online: 15 May 2023 (15:54:12 CEST)

Preprint ARTICLE | doi:10.20944/preprints202211.0041.v1

Min Xu, Jing SHAO, Boquan Liu, Lan Wang, Hongwei Ding, Yang Zhang

Subject: Social Sciences, Language And Linguistics Keywords: older adults; whispered speech; lexical tone; vowel; duration; intensity

Online: 2 November 2022 (03:53:54 CET)

Preprint ARTICLE | doi:10.20944/preprints202210.0424.v1

Neurocognitive Dynamics of Prosodic Salience over Semantics during Explicit and Implicit Processing of Basic Emotions in Spoken Words

Yi Lin, Xinran Fan, Yueqi Chen, Hao Zhang, Fei Chen, Hui Zhang, Hongwei Ding, Yang Zhang

Subject: Social Sciences, Language And Linguistics Keywords: emotional speech processing; communication channel; emotion category; task type

Online: 27 October 2022 (08:04:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints202105.0777.v1

A Learning Interaction Between Statistical Learning Experiments

Peter Richtsmeier, Lisa Goffman

Subject: Social Sciences, Psychology Keywords: statistical learning; experiment interaction; phonology; child speech; language acquisition

Online: 31 May 2021 (13:37:09 CEST)

Preprint REVIEW | doi:10.20944/preprints202009.0197.v2

Scholarship Suppression: Theoretical Perspectives and Emerging Trends

Sean Stevens, Lee Jussim, Nathan Honeycutt

Subject: Social Sciences, Psychology Keywords: academic freedom; free speech; censorship; free inquiry; thought suppression

Online: 12 October 2020 (10:07:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.0247.v1

Speech Emotion Recognition Using 1-D CLDNN with Data Augmentation

Shing-Tai Pan, Han-Jui Wu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Speech emotion recognition; one-dimensional neural network; LSTM; CNN; MFCCs

Online: 4 May 2023 (09:45:11 CEST)

Preprint CASE REPORT | doi:10.20944/preprints202212.0561.v1

Does the Potocki-Lupski Syndrome Convey the Autism Spectrum Disorder Phenotype? Case Report and Scoping Review

Oksana I. Talantseva, Galina V. Portnova, Raisa S. Romanova, Daria A. Martynova, Olga V. Sysoeva, Elena L. Grigorenko

Subject: Social Sciences, Psychology Keywords: Potocki–Lupski syndrome; 17p11.2; PTLS; autism; ASD; EEG; language; speech

Online: 29 December 2022 (13:00:18 CET)

Preprint ARTICLE | doi:10.20944/preprints202212.0426.v1

Novel Speech Recognition Systems Applied to Forensics within Child Exploitation: Wav2vec2.0 vs. Whisper

Juan Camilo Vásquez-Correa, Aitor Álvarez

Subject: Engineering, Electrical And Electronic Engineering Keywords: Speech Recognition; Keyword Spotting; Child abuse; Federated Learning; Whisper; Wav2vec2.0

Online: 22 December 2022 (09:27:37 CET)

Preprint DATA DESCRIPTOR | doi:10.20944/preprints202212.0118.v1

Visual Lip Reading Dataset in Turkish

Ali Berkol, Talya Tümer-Sivri, Melike Colak, Nergis Pervan-Akman, Hamit Erdem

Subject: Computer Science And Mathematics, Information Systems Keywords: Lip reading; Visual speech recognition; Turkish dataset; Face parts detection

Online: 7 December 2022 (06:50:33 CET)

Preprint ARTICLE | doi:10.20944/preprints202208.0109.v1

Effects of Data Augmentations on Speech Emotion Recognition

Bagus Tris Atmaja, Akira Sasou

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: speech emotion recognition; affective computing; data augmentations; wav2vec 2.0; SVM

Online: 4 August 2022 (14:09:21 CEST)

Preprint ARTICLE | doi:10.20944/preprints202205.0066.v1

Improving N-Best Rescoring in Under-Resourced Code-Switched Speech Recognition Using Pretraining and Data Augmentation

Joshua Miles Jansen van Vüren, Thomas Niesler

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: code-switching; automatic speech recognition; low resource languages; language modelling

Online: 6 May 2022 (09:09:31 CEST)

We present improvements in n-best rescoring of code-switched speech achieved by n-gram augmentation as well as optimised pretraining of long short-term memory (LSTM) language models with larger corpora of out-of-domain monolingual text. In addition, we consider the application of large pretrained transformer-based architectures. Our experimental evaluation is performed on an under-resourced corpus of code-switched speech comprising four bilingual code-switched sub-corpora, each containing a Bantu language (isiZulu, isiXhosa, Sesotho, or Setswana) and English. We find in our experiments that, by combining n-gram augmentation with the optimised pretraining strategy, speech recognition errors are reduced for each individual bilingual pair by 3.51% absolute on average over the four corpora. Importantly, we find that speech recognition at language boundaries also improves, by 1.14%, even though the additional data is monolingual. Utilising the augmented n-grams for lattice generation, we then contrast these improvements with those achieved after fine-tuning pretrained transformer-based models such as distilled GPT-2 and M-BERT. We find that, even though these language models have not been trained on any of our target languages, they can improve speech recognition performance even in zero-shot settings. After fine-tuning on in-domain data, these large architectures offer further improvements, achieving a 4.45% absolute decrease in overall speech recognition errors and a 3.52% improvement over language boundaries. Finally, a combination of the optimised LSTM and fine-tuned BERT models achieves a further gain of 0.47% absolute on average for three of the four language pairs compared to M-BERT. We conclude that the careful optimisation of the pretraining strategy used for neural network language models can offer worthwhile improvements in speech recognition accuracy even at language switches, and that much larger state-of-the-art architectures such as GPT-2 and M-BERT promise even further gains.
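
For readers unfamiliar with n-best rescoring, the general idea can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the method or code of the preprint: the function `rescore_nbest`, the interpolation weight `lm_weight`, and the toy scoring function `toy_lm` are all hypothetical, and a real system would replace `toy_lm` with an LSTM or transformer language model.

```python
from typing import Callable, List, Tuple


def rescore_nbest(
    nbest: List[Tuple[str, float]],
    lm_logprob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> str:
    """Return the hypothesis with the highest combined score.

    nbest holds (hypothesis, first_pass_log_score) pairs from a recogniser;
    lm_logprob returns a language-model log-probability for a hypothesis.
    """
    if not nbest:
        raise ValueError("n-best list is empty")
    best_text, _ = max(
        nbest,
        key=lambda item: item[1] + lm_weight * lm_logprob(item[0]),
    )
    return best_text


def toy_lm(text: str) -> float:
    """Toy stand-in for a language model (assumption, illustration only)."""
    score = -1.0 * len(text.split())  # penalise every word
    if "thank you" in text:
        score += 3.0  # reward a phrase the toy model considers likely
    return score


if __name__ == "__main__":
    hypotheses = [("thank ewe very much", -4.0), ("thank you very much", -4.5)]
    print(rescore_nbest(hypotheses, toy_lm, lm_weight=1.0))
```

With `lm_weight=0.0` the first-pass ranking is kept unchanged; raising the weight lets the language model overturn the first-pass choice, which is the effect the rescoring experiments in the abstract measure.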

Preprint ARTICLE | doi:10.20944/preprints202203.0333.v1

Multi-Model Learning to Detect Twitter Hate Speech

Dharmaraj Patil, Tareek Pattewar

Subject: Engineering, Control And Systems Engineering Keywords: Hate speech detection; Social media; Machine learning; Multi-model learning

Online: 25 March 2022 (02:10:12 CET)

Preprint ARTICLE | doi:10.20944/preprints201805.0274.v1

A Prototype of Speech Interface Based on Google Cloud Platform to Access a Semantic Website

Jimmy Aurelio Rosales-Huamani, José Luis Castillo-Sequera, Juan Carlos Montalvan-Figueroa, Joseps Andrade-Choque

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial intelligence; semantic web; natural language; Google cloud speech; SPARQL

Online: 21 May 2018 (12:38:00 CEST)

Preprint CASE REPORT | doi:10.20944/preprints202105.0278.v1

Laryngeal Paralysis Recovered Two Years After a Head Trauma by Growth Hormone Treatment and Neurorehabilitation

Joaquín Guerra, Hortensia Lema, Carlos Agra, Pedro Martínez, Jesús Devesa

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Growth Hormone; Recurrent nerve injury; Speech therapy; Neurostimulation; Vocal cord paralysis

Online: 13 May 2021 (09:27:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints201905.0228.v1

Improving Post-Filtering of Artificial Speech Using Pre-Trained LSTM Neural Networks

Marvin Coto-Jiménez

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning, LSTM, Machine learning, Post-filtering, Signal processing, Speech Synthesis

Online: 17 May 2019 (16:16:53 CEST)

Preprint ARTICLE | doi:10.20944/preprints201808.0522.v1

Pace, Emotion, and Language Tonality on Speech-to-song Illusion

Carole Leung, De-Hui Ruth Zhou

Subject: Social Sciences, Cognitive Science Keywords: speech-to-song illusion, auditory illusion, perception, pace, emotion, language tonality

Online: 30 August 2018 (10:37:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints201802.0096.v2

Application of a Lightweight Encryption Algorithm to a Quantized Speech Image for Secure IoT

Mourad Talbi, Med Salim Bouhalel

Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: IoT; security; encryption; quantized speech image; SNR; PESQ; histogram; entropy; correlation

Online: 15 February 2018 (19:57:48 CET)

Preprint ARTICLE | doi:10.20944/preprints202312.1013.v1

How to Construct and Deliver an Elevator Pitch: A Recipe for the Research Scientist

Leslie A Caromile, Ankita Jha, Jaye Gardiner, Ozlem Dilek, Ryoma Ohi, Lee Ligon

Subject: Social Sciences, Education Keywords: Elevator speech; research scientist; Accelerating Career; Transitions; American Society for Cell Biology

Online: 14 December 2023 (12:11:50 CET)

Preprint ARTICLE | doi:10.20944/preprints201907.0305.v1

The Metonymic Readings of the Greek Deictic Adverbs εδώ [Here] and εκεί [There] in Politics: A Cognitive Approach

Efthymia Tsaroucha

Subject: Social Sciences, Language And Linguistics Keywords: politics; political speech; economic crisis; Greece; deictics; space; time; image schemas; metonymicity

Online: 27 July 2019 (00:51:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints201903.0047.v1

DGR: Deep Gender Recognition of Human Speech

Rami S. Alkhawaldeh

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Gender Recognition; Speech Signal; Deep Learning; Evolutionary Search; PSO search; Wolf Search

Online: 4 March 2019 (13:42:02 CET)

Preprint ARTICLE | doi:10.20944/preprints201811.0126.v1

Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM

Sang-Ick Kang, Sangmin Lee

Subject: Engineering, Electrical And Electronic Engineering Keywords: Speech/Music Classification; Enhanced Voice Service, Long Short-Term Memory, Big Data

Online: 5 November 2018 (17:02:36 CET)

Preprint ARTICLE | doi:10.20944/preprints201802.0108.v1

Punctuation Generation Inspired Linguistic Features for Mandarin Prosody Generation

Chen-Yu Chiang, Yu-Ping Hung, Han-Yun Yeh, I-Bin Liao, Chen-Ming Pan

Subject: Computer Science And Mathematics, Information Systems Keywords: Mandarin; prosody generation; linguistic feature; break prediction; text-to-speech; punctuation confidence

Online: 16 February 2018 (15:39:58 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1636.v1

Auditory Challenges and Listening Effort in School-Age Children with Autism: Insights from Pupillary Dynamics during Speech in Noise Perception

Suyun Xu, Hua Zhang, Juan Fan, Xiaoming Jiang, Minyue Zhang, Jingjing Guan, Hongwei Ding, Yang Zhang

Subject: Social Sciences, Language And Linguistics Keywords: Autism spectrum conditions; Atypical resource allocation; Listening effort; Pupillometry; Speech-in-noise recognition

Online: 26 September 2023 (03:10:24 CEST)

Preprint REVIEW | doi:10.20944/preprints202309.0505.v1

How Can We Compare CI Systems across Manufacturers? A Scoping Review of Recent Literature

Elinor Tzvi-Minker, Andreas Keck

Subject: Medicine And Pharmacology, Otolaryngology Keywords: cochlear implant; patient-reported outcomes; pure tone average; speech in noise; music perception

Online: 7 September 2023 (11:22:04 CEST)

Preprint REVIEW | doi:10.20944/preprints202308.2166.v1

Artificial Intelligence in the Interpretation of Videofluoroscopic Swallowing Studies: Implications and Advances for Speech Pathologists

Anna M Girardi, Elizabeth A Cardell, Stephen P Bird

Subject: Public Health And Healthcare, Primary Health Care Keywords: dysphagia; artificial intelligence; videofluoroscopic swallowing study; deep learning; machine learning; imaging; speech pathology

Online: 31 August 2023 (10:42:28 CEST)

Preprint BRIEF REPORT | doi:10.20944/preprints202207.0062.v1

Post-Laryngectomy Voice Prosthesis Changes by Speech-Language-Pathologists: Preliminary results.

Stephane Hans, Gregoire Vialatte de Pemille, Robin Baudouin, Aude Julien-Laferriere, Florent Couineau, Lise Crevier-Buchman, Marta P Circiu, Jerome Rene Lechien

Subject: Medicine And Pharmacology, Otolaryngology Keywords: Total Laryngectomy; Cancer; Voice; Voice prosthesis; Otolaryngology; Head Neck Surgery; Speech Language Therapists.

Online: 5 July 2022 (05:44:14 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.0963.v1

Automated Text Annotation Using Semi-Supervised Approach with Meta Vectorizer and Machine Learning Algorithms for Hate Speech Detection

Shoffan Saifullah, Rafał Dreżewski, Felix Andika Dwiyanto, Agus Sasmito Aribowo, Yuli Fauziah, Nur Heri Cahyana

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Hate Speech Detection; Machine Learning; Sentiment Analysis; Semi-Supervised Learning; Self-Learning; Text Mining

Online: 15 November 2023 (09:58:07 CET)

Text annotation is an essential element of natural language processing approaches. The manual annotation process performed by humans has several drawbacks, such as subjectivity, slowness, fatigue, and possibly carelessness. In addition, annotators may annotate ambiguous data. We therefore developed the concept of automated annotation to obtain the best annotations using several machine-learning approaches. The proposed approach is based on an ensemble algorithm of meta-learners and meta-vectorizer techniques. The approach employs a semi-supervised learning technique for automated annotation, aimed at detecting hate speech. This involves leveraging various machine learning algorithms, including Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (KNN), and Naive Bayes (NB), in conjunction with Word2Vec and TF-IDF text extraction methods. The annotation process is performed using 13,169 Indonesian YouTube comments. The proposed model applies stemming using data from Sastrawi together with 2,245 newly added words. Semi-supervised learning uses 5%, 10%, and 20% of labeled data, as compared to performing labeling based on 80% of the datasets. In semi-supervised learning, the model learns from the labeled data, which provides explicit information, and the unlabeled data, which offers implicit insights. This hybrid approach enables the model to generalize and make informed predictions even when limited labeled data is available, ultimately enhancing its ability to handle real-world scenarios with scarce annotated information. In addition, the proposed method uses a variety of thresholds for matching words labeled with hate speech: 0.6, 0.7, 0.8, and 0.9. The experiment showed that the KNN-Word2Vec model has the best accuracy, 96.9%, in the 5%:80%:0.9 scenario. However, several other methods also achieve accuracy above 90%, such as SVM and DT with both text extraction methods in several test scenarios.
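
The self-learning loop with a confidence threshold described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: a toy word-vote classifier stands in for the SVM, DT, KNN, and NB learners, and the function names, example comments, and the 0.7 threshold are all hypothetical. The model is trained on the labeled seed set, unlabeled texts predicted with confidence at or above the threshold are moved into the labeled set, and the loop repeats until nothing new qualifies.

```python
from collections import Counter
from typing import Dict, List, Tuple

Labeled = Tuple[str, int]  # (text, label) with 1 = hate speech, 0 = not


def train(labeled: List[Labeled]) -> Dict[int, Counter]:
    """Toy stand-in model: count how often each word appears under each label."""
    counts: Dict[int, Counter] = {0: Counter(), 1: Counter()}
    for text, label in labeled:
        counts[label].update(text.split())
    return counts


def predict(counts: Dict[int, Counter], text: str) -> Tuple[int, float]:
    """Return (label, confidence) from add-one-smoothed word-vote shares."""
    words = text.split()
    votes1 = sum(counts[1][w] for w in words) + 1
    votes0 = sum(counts[0][w] for w in words) + 1
    p1 = votes1 / (votes0 + votes1)
    return (1, p1) if p1 >= 0.5 else (0, 1 - p1)


def self_train(
    labeled: List[Labeled],
    unlabeled: List[str],
    threshold: float = 0.9,
    max_rounds: int = 5,
) -> Tuple[List[Labeled], List[str]]:
    """Pseudo-label unlabeled texts whose confidence reaches the threshold."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, rest = [], []
        for text in pool:
            label, conf = predict(model, text)
            (confident if conf >= threshold else rest).append((text, label))
        if not confident:
            break  # nothing new passed the threshold; stop early
        labeled.extend(confident)
        pool = [text for text, _ in rest]
    return labeled, pool


seed = [("bad bad bad", 1), ("nice nice nice", 0)]
annotated, remaining = self_train(seed, ["bad bad", "nice day", "hello"], threshold=0.7)
print(annotated, remaining)
```

Raising the threshold makes the loop accept fewer pseudo-labels per round, which is the trade-off the 0.6 to 0.9 settings in the abstract explore.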

Preprint ARTICLE | doi:10.20944/preprints202211.0037.v1

Cognitive Factors in Nonnative Phonetic Learning: Impacts of Working Memory and Selective Attention on the Benefits and Costs of Talker Variability

Xiaojuan Zhang, Bing Cheng, Yang Zhang

Subject: Social Sciences, Language And Linguistics Keywords: non-native speech learning; talker variability; phonetically-irrelevant variability; long-term retention; cognitive abilities

Online: 2 November 2022 (03:05:23 CET)

Preprint ARTICLE | doi:10.20944/preprints202203.0258.v1

Artificial Intelligence and Online Hate Speech Moderation: A Risky Match?

Natalie Alkiviadou

Subject: Social Sciences, Law Keywords: hate speech; artificial intelligence; social media platforms; content moderation; freedom of expression; non-discrimination

Online: 17 March 2022 (15:26:41 CET)

Preprint ARTICLE | doi:10.20944/preprints202112.0134.v1

Gender Identification in a Two-Level Hierarchical Speech Emotion Recognition System for an Italian Social Robot

Antonio Guerrieri, Eleonora Braccili, Federica Sgrò, Giulio Meldolesi

Subject: Computer Science And Mathematics, Robotics Keywords: Human Robot Interaction (HRI); social robot; Speech Emotion Recognition (SER); Gender Recognition, affective states

Online: 8 December 2021 (14:31:07 CET)

Working Paper REVIEW

Hearing Loss and Brain Plasticity: The Hyperactivity Phenomenon

Björn Herrmann, Blake Butler

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: hearing loss; aging; hyperactivity; excitability; loss of inhibition; neurophysiology; auditory perception; neural plasticity; speech processing

Online: 15 April 2021 (13:34:54 CEST)

Preprint ARTICLE | doi:10.20944/preprints202008.0645.v1

Recognizing More Emotions with Less Data Using Self-supervised Transfer Learning

Jonathan Boigne, Biman Liyanage, Ted Östrem

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Speech Emotion Recognition; Emotion AI; Self-Supervised Learning; Transfer Learning; Low Resource Training; wav2vec

Online: 28 August 2020 (15:05:37 CEST)

Preprint ARTICLE | doi:10.20944/preprints201811.0163.v1

Speech Recognition and a Cymatics Based Configurable Speech Perception

Saeed MIAN QAISAR

Subject: Engineering, Electrical And Electronic Engineering Keywords: Cymatics, Speech recognition, Mel-Frequency Cepstral Coefficients (MFCC), Dynamic time warping (DTW), Chladni plates

Online: 7 November 2018 (13:42:22 CET)

Preprint ARTICLE | doi:10.20944/preprints201810.0739.v1

An Efficient Isolated Speech Recognition Based on the Adaptive Rate Processing and Analysis

Saeed MIAN QAISAR

Subject: Engineering, Electrical And Electronic Engineering Keywords: Event-Driven Processing, Speech recognition, Adaptive Resolution Analysis, Features extraction, Dynamic Time Warping, Classification

Online: 31 October 2018 (08:14:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0845.v1

Nollywood: Let’s Go to the Movies!

John E. Ortega, William Chen, Ibrahim Said Ahmad

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Natural Language Processing; Automatic Speech Recognition; Machine Translation; Nigeria; English; United States; Movies; machine learning

Online: 15 February 2024 (12:21:40 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1339.v1

AI Enhancements for Linguistic E-learning System

Jueting Liu, Sicheng Li, Chang Ren, Yibo Lyu, Tingting Xu, Zehua Wang, Wei Chen

Subject: Computer Science And Mathematics, Other Keywords: linguistic E-learning; phonetic transcription; mel frequency cepstrum coefficient; grapheme-to-phoneme; transformer; speech synthesis

Online: 20 September 2023 (09:59:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202108.0433.v1

A Novel Heterogeneous Parallel Convolution Bi-LSTM for Speech Emotion Recognition

Huiyun Zhang, Heming Huang, Henry Han

Subject: Computer Science And Mathematics, Computer Science Keywords: Speech emotion recognition; Feature extraction; Heterogeneous parallel network; Spectral features; Prosodic features; Multi-feature fusion

Online: 23 August 2021 (12:16:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202106.0296.v1

How Visual Word Decoding and Context-driven Auditory Semantic Integration Contribute to Reading: A Test of Additive vs. Multiplicative Models

Yu Li, Hongbing Xing, Linjun Zhang, Hua Shu, Yang Zhang

Subject: Social Sciences, Psychology Keywords: reading comprehension; speech-in-noise recognition; nature F0 contours; flattened F0 contours; Chinese character decoding

Online: 10 June 2021 (13:36:17 CEST)

Preprint CASE REPORT | doi:10.20944/preprints202004.0443.v1

The clinical outcome of concurrent speech therapy and transcranial direct current stimulation in dysarthria and palilalia following traumatic brain injury: A case study

Masumeh Bayat, Mahshid Tahamtan, Malihe Sabeti, Mohammad Nami

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: traumatic brain injury (TBI); Dysarthria; transcranial direct current stimulation (tDCS); Quantitative Electroencephalography (QEEG); speech therapy

Online: 24 April 2020 (13:56:38 CEST)

Preprint ARTICLE | doi:10.20944/preprints201901.0029.v1

An Automated Robot-Car Control System with Hand-Gestures and Mobile Application Using Arduino

Saleem Ullah, Zain Mumtaz, Shuo Liu, Mohammad Abubaqr, Athar Mahboob, Hamza Ahmad Madni

Subject: Engineering, Electrical And Electronic Engineering Keywords: Android; arduino; bluetooth; hand-gesture recognition; low cost; open source; sensors; smart cars; speech recognition

Online: 3 January 2019 (14:32:23 CET)

Preprint CASE REPORT | doi:10.20944/preprints201805.0300.v1

Rett Syndrome: Treatment with IGF-1, Melatonin, Blackcurrant Extracts, and Rehabilitation

Jesús Devesa, Olga Devesa, María Carrillo, Nerea Casteleiro, Ana Devesa, David Llorente, Cristina González

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: IGF-1; MT; Blackcurrant extracts; Oxidative stress; Mecp2; Speech therapy; Neurostimulation; cyclic glycine-proline; GPE.

Online: 22 May 2018 (11:25:58 CEST)

Preprint ARTICLE | doi:10.20944/preprints202303.0517.v1

Hearing Assistive Technology Facilitates Sentence-in-Noise Recognition in Children with Autism Spectrum Disorder

Suyun Xu, Juan Fan, Hua Zhang, Minyue Zhang, Hang Zhao, Xiaoming Jiang, Hongwei Ding, Yang Zhang

Subject: Biology And Life Sciences, Behavioral Sciences Keywords: Autism spectrum disorder; Auditory stream segregation; Hearing assistive technology; Speech-in-noise perception; Tonal language speakers

Online: 30 March 2023 (02:52:15 CEST)

Preprint ARTICLE | doi:10.20944/preprints202101.0621.v1

Low-Power Audio Keyword Spotting using Tsetlin Machines

Jie Lei, Tousif Rahman, Rishad Shafik, Alex Yakovlev, Ole-Christoffer Granmo, Fahim Kawsar, and Mathur

Subject: Engineering, Electrical And Electronic Engineering Keywords: Speech Command; MFCC; Tsetlin Machine; Learning Automata; Pervasive AI; Machine Learning; Artificial Neural Network; Keyword Spotting

Online: 29 January 2021 (13:01:47 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1585.v1

Enhancing Mental Health Support through Artificial Intelligence: Advances in Speech and Text Analysis within Online Therapy Platforms

Mariem Jelassi, Khouloud Matteli, Houssem Ben Khalfallah, Jacques Demongeot

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Conversational AI; Automatic Speech Recognition (ASR); Natural Language Processing (NLP); Online Therapy Platforms; AI in Mental Healthcare

Online: 27 February 2024 (16:45:37 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1202.v1

Speech Emotion Recognition Using Convolutional Neural Networks with Attention Mechanism

Konstantinos Mountzouris, Isidoros Perikos, Ioannis Hatzilygeroudis

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: speech emotion recognition; deep learning; Deep Belief Network; deep neural network; Convolutional Neural Network; LSTM; attention mechanism

Online: 19 September 2023 (08:24:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.0347.v1

A Comparison of Picture Exchange Communication System (PECS) and Speech-Generating Device as Communication Modes for Children with Autism Spectrum Disorders

Roberta Simeoli, Luigi Iovino, Davide Marocco, Giada Guglielmino, Angelo Rega

Subject: Social Sciences, Psychology Keywords: Augmentative and Alternative Communication; Autism; Picture Exchange Communication; Speech Generating Device; Vocal production; Problem behavior; Communicative behavior

Online: 5 July 2023 (15:38:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.1456.v1

Combining Transformer, CNN, and LSTM Architectures: A Novel Ensemble Learning Technique That Leverages Multi-acoustic Features for Speech Emotion Recognition in Distance Education Classrooms

Eman Abdulrahman Alkhamali, Arwa Allinjawi, Rehab Bahaaddin Ashari

Subject: Artificial Intelligence And Machine Learning, Computer Science And Mathematics Keywords: Transformer, convolutional neural network, long short-term memory, speech emotion recognition, distance education, real-time, emotional stability, instructors

Online: 22 April 2024 (18:40:06 CEST)

Speech emotion recognition (SER) is a technology that can be applied in distance education to analyze speech patterns and evaluate speakers’ emotional states in real time. It provides valuable insights and can be used to enhance the learning experience by enabling the assessment of instructors’ emotional stability, a factor that significantly impacts information delivery effectiveness. Students demonstrate different engagement levels during learning activities, and assessing this engagement is an important aspect of controlling the learning process and improving e-learning systems. An important factor that may influence student engagement is the emotional state of their instructors. Accordingly, this research uses deep learning techniques to create an automated system for recognizing instructors’ emotions in their speech when delivering distance learning. The methodology integrates Transformer, convolutional neural network, and long short-term memory architectures into an ensemble to enhance SER. Features were extracted from the audio data using Mel-frequency cepstral coefficients, chroma, Mel spectrogram, zero-crossing rate, spectral contrast, centroid, bandwidth, roll-off, and root-mean-square, with subsequent optimization processes that add noise to, time-stretch, and shift the audio data. Notably, several Transformer blocks were incorporated, and a multi-head self-attention mechanism was employed to identify the relationships between the input sequence segments. The pre-processing and data augmentation methodologies significantly enhanced the precision of the results: the model achieved accuracy rates of 96.3%, 99.86%, 96.5%, and 85.3% on the Ryerson Audio-Visual Database of Emotional Speech and Song, Berlin Database of Emotional Speech, Surrey Audio-Visual Expressed Emotion, and Interactive Emotional Dyadic Motion Capture datasets, respectively.
Furthermore, it achieved 83% accuracy on another dataset created for this research—the Saudi Higher Education Instructor Emotions dataset. The results demonstrate this model’s considerable accuracy in detecting emotions in speech data across different languages and datasets.

Preprint ARTICLE | doi:10.20944/preprints202402.0813.v1

Large Scale Speech Recognition for Low Resource Language Amharic, an End-to-End Approach

Yohannes Ayana Ejigu, Tesfa Tegegne Asfaw

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Automatic speech recognition; Convolutional Neural Network; Connectionist Temporal Classification; End-to-End; Neural network; Erosion; Recurrent Neural Network

Online: 15 February 2024 (02:11:18 CET)

Preprint REVIEW | doi:10.20944/preprints201903.0033.v1

Augmentative and Alternative Communication (AAC) Advances: A Review of Configurations for Speech Disabled Individuals

Yasmin Elsahar, Sijung Hu, Kaddour Bouazza-Marouf, David Kerr, Annysa Mansor

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: augmentative and alternative communication; assistive technologies; sensing modalities; signal processing; voice communication; machine learning; mobile health; speech disability

Online: 4 March 2019 (10:14:44 CET)

Preprint ARTICLE | doi:10.20944/preprints202302.0035.v1

Hearing Aids Reduce Self-Perceived Difficulties in Noise for Listeners with Normal Audiograms

Kiri Mealings, Joaquin T. Valderrama, Jorge Mejia, Ingrid Yeend, Elizabeth F. Beach, Brent Edwards

Subject: Medicine And Pharmacology, Otolaryngology Keywords: Speech-in-noise hearing difficulties; Hidden hearing loss (HHL); hearing aids; self-report; Reaction time; Ecological momentary assessment (EMA)

Online: 2 February 2023 (08:37:41 CET)

Preprint ARTICLE | doi:10.20944/preprints201904.0274.v1

Improving the Translation Environment for Professional Translators

Vincent Vandeghinste, Tom Vanallemeersch, Liesbeth Augustinus, Bram Bulté, Frank Van Eynde, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, Geert Heyman, Marie-Francine Moens, Iulianna van der Lek-Ciudin, Frieda Steurs, Ayla Rigouts Terryn, Els Lefever, Arda Tezcan, Lieve Macken, Véronique Hoste, Joke Daems, Joost Buysschaert, Sven Coppers, Jan Van den Bergh, Kris Luyten

Subject: Social Sciences, Language And Linguistics Keywords: computer-aided translation; machine translation; speech translation; translation memory-machine translation integration; user interface; domain-adaptation; human-computer interface

Online: 25 April 2019 (07:59:18 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.0742.v1

The Role of Big Five Personality Traits, Basic Psychological Need Satisfaction, and Need Frustration in Predicting Athletes' Organic Self-Talk

Aristea Karamitrou, Nikos Comoutos, Evangelos Brisimis, Alexander T. Latinjak, Antonis Hatzigeorgiadis, Yannis Theodorakis, Georgios Loules, Yannis Tzioumakis, Charalampos Krommidas

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: inner speech; spontaneous self-talk; goal-directed self-talk; big five personality traits; self-determination theory; autonomy; competence; relatedness; sport

Online: 9 August 2023 (10:31:01 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0619.v1

Revolutionizing Radiological Analysis: The Future of French Language Automatic Speech Recognition in Healthcare

Mariem Jelassi, Oumayma Jamai, Jacques Demongeot

Subject: Medicine And Pharmacology, Other Keywords: Automatic Speech Recognition (ASR), Medical Transcription, Radiology, Whisper Large-v2 Model, Language-Specific ASR Systems, French Language Processing, AI in Healthcare

Online: 11 March 2024 (12:16:52 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1557.v1

Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework

Tosin Ige, Christopher Kiekintveld, Aritran Piplai

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Phishing; Random Forest; Deep Learning; Recurrent Neural Network; Long Short-Term Memory; Speech Synthesis; Vision Synthesis; Phishing Detection Framework; Adaptive Framework

Online: 27 February 2024 (15:39:23 CET)

Preprint ARTICLE | doi:10.20944/preprints202101.0005.v1

Effects of tDCS on Sound Duration in Patients with Apraxia of Speech in Primary Progressive Aphasia

Charalambos Themistocleous, Kimberly Webster, Kyrana Tsapkini

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: apraxia of speech (AOS); transcranial direct current stimulation (tDCS); primary progressive aphasia (PPA); inferior frontal gyrus (IFG); sound duration; brain stimulation

Online: 4 January 2021 (10:19:48 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0754.v1

Enhancing Amharic Speech Recognition in Noisy Conditions through End-to-End Deep Learning

Yohannes Ayana Ejigu, Tesfa Tegegne Asfaw

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Automatic speech recognition, Convolutional Neural Network, Connectionist Temporal Classification, End-to-End, Neural network, Noisy, Recurrent Neural Network, Subspace filtering, Spectral Subtraction

Online: 13 February 2024 (14:26:24 CET)

Preprint ARTICLE | doi:10.20944/preprints202307.0413.v1

Development of a Voice Virtual Assistant for the Geospatial Data Visualization Application on the Web

Homeyra Mahmoudi, Silvana Camboim, Maria Antonia Brovelli

Subject: Computer Science And Mathematics, Other Keywords: Voice user interface; Geographic Information System; human-computer interaction; multimodal interface; natural language; Web application; Natural language interaction; Voice virtual assistant; Speech recognition

Online: 6 July 2023 (10:08:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints202106.0687.v1

Automatic Speech Recognition (ASR) Systems Applied to Pronunciation Assessment of L2 Spanish for Japanese Speakers

Cristian Tejedor-García, Valentín Cardeñoso-Payo, David Escudero-Mancebo

Subject: Physical Sciences, Acoustics Keywords: automatic speech recognition (ASR); automatic assessment tools; foreign language pronunciation; pronunciation training; computer-assisted pronunciation training (CAPT); automatic pronunciation assessment; learning environments; minimal pairs

Online: 29 June 2021 (07:31:41 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.0402.v1

The Self Course: Lessons Learned from Students’ Weekly Questions

Alain Morin

Subject: Social Sciences, Psychology Keywords: self; course; self-reflection; self-rumination; self-knowledge; mindfulness; prospection; autobiography; self-regulation; self-recognition; self-esteem; culture; inner speech; traumatic brain injury; Theory-of-Mind

Online: 6 May 2023 (09:32:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints202310.1830.v1

Transcranial Direct Current Stimulation (tDCS) Paired with ReST Training in Apraxia of Speech in Young Adult with Trisomy 21

Ester Miyuki Nakamura-Palacios, Aldren Thomazini Falçoni Júnior, Gabriela Lolli Tanese, Ana Carla Estellita Vogeley, Aravind Kumar Namasivayam

Subject: Medicine And Pharmacology, Neuroscience And Neurology Keywords: Apraxia of speech; Trisomy 21 (Down syndrome); transcranial Direct Current Stimulation (tDCS); Rapid Syllable Transition Training (ReST); Broca’s area; Wernicke’s area; supramarginal gyrus; Sylvian Temporal Parietal Junction

Online: 30 October 2023 (07:16:51 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.0528.v1

Performance of Lip-sync instead of Speech Imagery, New Combination Signals, Supplement Bond Graph Classifier and Deep Formula Detection as Confidents Extraction and Roots Detection Classifier for EEG and BCI

Ahmad Naebi, Zuren Feng

Subject: Engineering, Bioengineering Keywords: Speech Imagery; Mental Task; Machine Learning; Feature Extraction; Common spatial pattern (CSP); Filter bank Common Spatial Pattern (FBCSP); Brain-Computer Interface (BCI); Principal Components Analysis (PCA); Feature Selection; Channel Selection; Mutual Information; Lagrange Formula; Deep Learning; SVM Classifier

Online: 7 August 2023 (10:23:13 CEST)

Nowadays, brain signal processing is performed rapidly in various brain-computer interface (BCI) applications. Most researchers focus on developing new methods for the future or on improving basic implemented models to identify the optimum standalone feature set. Our research focuses on four ideas. One of them introduces a future communication model, and the others improve existing models or methods. These are: 1) A new communication imagery model, based on a mental task, to replace speech imagery: speech imagery is very difficult, and it is impossible to imagine the sound of every character in every language. Our research introduces a new mental task model for all languages, called lip-sync imagery, which can be used for all characters in all languages. This paper implements lip-sync imagery for two sounds, characters, or letters. 2) New combination signals: selecting an inopportune frequency domain can lead to inefficient feature extraction, so domain selection is important for processing. This combination of limited frequency ranges is proposed as a preliminary step towards creating fragmentary continuous frequency. For the first model, two intervals of 4 Hz were examined and tested as filter banks. The primary purpose is to identify the combination of 4 Hz filter banks from the 4 Hz to 40 Hz frequency domain that, as new combination signals (8 Hz), yields good and efficient features by increasing distinctive patterns and decreasing similar patterns of brain activity. 3) A new supplement bond graph classifier for the SVM classifier: the performance of a linear SVM decreases on very noisy data, so we introduce a new linear bond graph classifier to supplement the linear SVM on noisy data. 4) A deep formula recognition model: it converts the data of the first layer into a formula model (formula extraction model). The main goal is to reduce the noise in the subsequent layers for the coefficients of the formulas.
The output of the last layer is the coefficients selected by different functions in different layers. Finally, the classifier extracts the root interval of the formulas, and the diagnosis is made based on the root interval. Results were obtained by implementing the methods for all four ideas, and they range from 55% to 98%. The lowest result is 55%, for the deep formula detection, and the highest is 98%, for the new combination signals.
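
The search space behind the "new combination signals" idea can be made concrete with a short sketch. It assumes, as one reading of the abstract, that the 4 Hz to 40 Hz range is cut into contiguous, non-overlapping 4 Hz filter banks and that every pair of banks is a candidate 8 Hz combination. The function name and the pairing scheme are illustrative assumptions, and no actual signal filtering is performed here.

```python
from itertools import combinations
from typing import List, Tuple

Band = Tuple[int, int]  # (low edge in Hz, high edge in Hz)


def filter_banks(low_hz: int = 4, high_hz: int = 40, width_hz: int = 4) -> List[Band]:
    """Split [low_hz, high_hz] into contiguous bands of width_hz (assumed layout)."""
    return [(f, f + width_hz) for f in range(low_hz, high_hz, width_hz)]


banks = filter_banks()
# Every unordered pair of 4 Hz banks is one candidate 8 Hz combination signal.
pairs = list(combinations(banks, 2))

print(len(banks), "banks,", len(pairs), "candidate pairs")
print(pairs[0], pairs[-1])
```

Under these assumptions there are 9 banks and 36 candidate pairs, so an exhaustive comparison of all combinations stays cheap.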

Search Results

96 articles found