ARTICLE | doi:10.20944/preprints202010.0526.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: audio classification; dissimilarity space; siamese network; ensemble of classifiers; pattern recognition; animal audio
Online: 26 October 2020 (13:57:01 CET)
The classifier system proposed in this work combines the dissimilarity spaces produced by a set of Siamese neural networks (SNNs) built on four different backbones with different clustering techniques for training SVMs for automated animal audio classification. The system is evaluated on two animal audio datasets: one of cat vocalizations and one of bird vocalizations. Different clustering methods reduce the spectrograms in the dataset to a set of centroids that generate (in both a supervised and an unsupervised fashion) the dissimilarity space through the Siamese networks. In addition to feeding the SNNs with spectrograms, additional experiments process the spectrograms using the Heterogeneous Auto-Similarities of Characteristics (HASC) descriptor. Once the dissimilarity spaces are computed, a vector-space representation of each pattern is generated and used to train a Support Vector Machine (SVM) that classifies a spectrogram by its dissimilarity vector. Results demonstrate that the proposed approach performs competitively (without ad-hoc optimization of the clustering methods) on both animal vocalization datasets. To further demonstrate the power of the proposed system, the best stand-alone approach is also evaluated on the challenging Environmental Sound Classification dataset (ESC-50). The MATLAB code used in this study is available at https://github.com/LorisNanni.
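As a rough sketch of this pipeline (in Python rather than the authors' MATLAB, with an assumed trained Siamese embedding `embed` and illustrative names throughout), each spectrogram can be represented by its distances to clustering centroids in the embedding space, and an SVM trained on those dissimilarity vectors:

```python
# Minimal sketch of dissimilarity-space classification; `embed` (a trained
# Siamese backbone), the cluster count, and variable names are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def dissimilarity_vectors(features, centroids, embed):
    """Represent each spectrogram by its distances to the cluster
    centroids, measured in the Siamese embedding space."""
    emb_x = embed(features)          # (n_samples, d)
    emb_c = embed(centroids)         # (k, d)
    # Euclidean distance from every sample to every centroid
    return np.linalg.norm(emb_x[:, None, :] - emb_c[None, :, :], axis=-1)

# X_train: flattened spectrograms, y_train: class labels (assumed given)
k = 30
centroids = KMeans(n_clusters=k).fit(X_train).cluster_centers_
D_train = dissimilarity_vectors(X_train, centroids, embed)
svm = SVC(kernel="rbf").fit(D_train, y_train)
# Test time: svm.predict(dissimilarity_vectors(X_test, centroids, embed))
```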
ARTICLE | doi:10.20944/preprints201804.0258.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: audio classification; multi-resolution analysis; LSTM; auto-ml
Online: 19 July 2018 (05:53:20 CEST)
We describe a multi-resolution approach for audio classification and illustrate its application to the open dataset for environmental sound classification. The proposed approach uses a multi-resolution ensemble built from targeted feature extraction of the approximation (coarse-scale) and detail (fine-scale) portions of the signal under the action of multiple transforms. This is paired with an automatic machine learning engine for algorithm and parameter selection and with an LSTM, which maps sequences of features to a predicted class-membership probability distribution. Initial results show an improvement in multi-class classification accuracy.
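A minimal sketch of the multi-resolution idea, assuming a discrete wavelet transform as the multi-scale decomposition and per-band energy summaries; the wavelet, frame length, and network sizes are illustrative assumptions, not the paper's configuration:

```python
# Coarse (approximation) and fine (detail) wavelet features per frame,
# fed as a sequence to an LSTM classifier.
import numpy as np
import pywt
import tensorflow as tf

def frame_features(signal, frame_len=1024, wavelet="db4", level=3):
    feats = []
    for start in range(0, len(signal) - frame_len, frame_len):
        frame = signal[start:start + frame_len]
        coeffs = pywt.wavedec(frame, wavelet, level=level)
        # Summarize approximation (coeffs[0]) and detail bands by energy
        feats.append([np.sum(c ** 2) for c in coeffs])
    return np.array(feats)  # (n_frames, level + 1)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(None, 4)),  # level + 1 = 4 bands
    tf.keras.layers.Dense(50, activation="softmax"),  # e.g. 50 ESC classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```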
ARTICLE | doi:10.20944/preprints202108.0277.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Bioacoustics; Machine Hearing; Bird sound recognition; Artificial Neural Networks; Audio Signal Processing
Online: 12 August 2021 (13:34:50 CEST)
The automatic classification of bird sounds is an ongoing research topic, and several results have been reported for the classification of selected bird species. In this contribution we use an artificial neural network fed with pre-computed sound features to study the robustness of bird sound classification. We investigate in detail if and how classification results depend on the number of species and on the selection of species in the subsets presented to the classifier. In more detail, a bag-of-birds approach is employed to randomly create balanced subsets of sounds from different species for repeated classification runs. The number of species present in each subset is varied between 10 and 300, randomly drawing sounds of species from a dataset of 659 bird species taken from the Xeno-Canto database. We observe that the shallow artificial neural network trained on pre-computed sound features is able to classify the bird sounds relatively well. The classification performance is evaluated using several common measures such as precision, recall, accuracy, mean average precision, and area under the receiver operating characteristic curve. All of these measures indicate a decrease in classification success as the number of species present in the subsets is increased. We analyze this dependence in detail and compare the computed results to an analytic explanation that assumes dependencies for an idealized perfect classifier. Moreover, we observe that the classification performance depends on the individual composition of the subset and varies across 20 randomly drawn subsets.
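A hedged sketch of the bag-of-birds protocol, assuming pre-computed features X with species labels y; the network size and split ratio are illustrative assumptions:

```python
# Repeatedly draw balanced random species subsets and train a shallow
# network on pre-computed features, measuring accuracy per subset size.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def run_bag_of_birds(X, y, all_species, n_species, n_runs=20):
    accuracies = []
    for _ in range(n_runs):
        subset = rng.choice(all_species, size=n_species, replace=False)
        mask = np.isin(y, subset)
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[mask], y[mask], test_size=0.3, stratify=y[mask])
        clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
        clf.fit(X_tr, y_tr)
        accuracies.append(clf.score(X_te, y_te))
    return np.mean(accuracies), np.std(accuracies)

# Accuracy is expected to drop as n_species grows from 10 toward 300,
# and to vary with the particular species drawn into each subset.
```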
ARTICLE | doi:10.20944/preprints201807.0185.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: deep learning; multi-task learning; audio event detection; audio tagging; weak learning; low-resource data
Online: 10 July 2018 (16:05:15 CEST)
In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of the events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve training performance when dealing with this kind of low-resource dataset. We evaluate three data-efficient approaches to training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that the different training methods have different advantages and disadvantages.
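One plausible realisation of such a factorisation, sketched in Keras under assumed layer sizes: a shared convolutional-recurrent trunk with a frame-level head (temporal activity) and a clip-level head (weak tags) trained jointly:

```python
# Multi-task stacked CRNN sketch; sizes and pooling are assumptions,
# not the paper's exact architecture.
import tensorflow as tf
from tensorflow.keras import layers

n_classes, n_frames, n_mels = 10, 500, 64
spec_in = layers.Input(shape=(n_frames, n_mels, 1))
x = layers.Conv2D(32, 3, padding="same", activation="relu")(spec_in)
x = layers.MaxPooling2D((1, 4))(x)                 # pool frequency only
x = layers.Reshape((n_frames, (n_mels // 4) * 32))(x)
x = layers.Bidirectional(layers.GRU(64, return_sequences=True))(x)

# Frame-level head: per-frame event activity (an intermediate task)
frames = layers.TimeDistributed(
    layers.Dense(n_classes, activation="sigmoid"), name="frames")(x)
# Clip-level head: aggregate frame activity into weak clip tags
tags = layers.GlobalAveragePooling1D(name="tags")(frames)

model = tf.keras.Model(spec_in, [frames, tags])
model.compile(optimizer="adam",
              loss={"frames": "binary_crossentropy",
                    "tags": "binary_crossentropy"})
```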
ARTICLE | doi:10.20944/preprints201811.0509.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: audio recognition; context-aware; deep learning
Online: 20 November 2018 (16:32:16 CET)
This paper proposes a method for recognizing audio events in urban environments that combines handcrafted audio features with a deep learning architecture (Convolutional Neural Networks, CNNs) trained to distinguish between different audio context classes. The core idea is to use the CNN as a method to extract context-aware deep audio features that can offer supplementary feature representations to any soundscape analysis classification task. Towards this end, the CNN is trained on a database of audio samples that are annotated in terms of their respective "scene" (e.g. train, street, park), and is then combined with handcrafted audio features in an early fusion approach in order to recognize the audio event of an unknown audio recording. Detailed experimentation shows that the proposed context-aware deep learning scheme, when combined with typical handcrafted features, leads to a significant boost in classification accuracy. The main contribution of this work is the demonstration that transferring audio contextual knowledge using CNNs as feature extractors can significantly improve the performance of the audio classifier, without the need to train a CNN for the target task (a rather demanding process that requires huge datasets and complex data augmentation procedures).
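A minimal sketch of the early-fusion step, assuming a scene-trained CNN exposing its penultimate-layer activations (`scene_cnn.penultimate`) and a `handcrafted_features` extractor; both names are hypothetical:

```python
# Concatenate context-aware deep features with handcrafted features,
# then train a standard classifier on the fused representation.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fused_representation(spectrogram, waveform):
    deep = scene_cnn.penultimate(spectrogram)  # context-aware deep features
    hand = handcrafted_features(waveform)      # e.g. MFCCs, ZCR, energy
    return np.concatenate([deep, hand])

X = np.stack([fused_representation(s, w) for s, w in zip(specs, waves)])
clf = make_pipeline(StandardScaler(), SVC()).fit(X, y)
```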
ARTICLE | doi:10.20944/preprints202108.0185.v1
Subject: Mathematics & Computer Science, Other Keywords: AES; Audio analysis; Authenticated encryption; Cryptography; Python
Online: 9 August 2021 (09:31:46 CEST)
The focus of this research is to analyze the results of encrypting audio using various authenticated encryption algorithms implemented in the Python cryptography library, for ensuring the authenticity and confidentiality of the original contents. The Advanced Encryption Standard (AES) is used as the underlying cryptographic primitive in conjunction with various modes, including Galois/Counter Mode (GCM), Counter with Cipher Block Chaining Message Authentication Code (CCM), and Cipher Block Chaining (CBC) with Keyed-Hashing, for encrypting a relatively small audio file. The encrypted audio shows similar variance when encrypting with AES-GCM and AES-CCM. With AES-CBC with Keyed-Hashing there is a noticeable reduction in the variance of the encodings and an increase in the time it takes to encrypt and decrypt the same audio file; in addition, the audio encrypted with this mode spans a longer duration. As a result, AES should be used with either GCM or CCM for efficient and reliable integration of authenticated encryption within a workflow.
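For concreteness, a minimal example of authenticated encryption of an audio file with AES-GCM via the Python cryptography library; the file name and key size are assumptions:

```python
# Authenticated encryption of an audio file with AES-GCM.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)               # 96-bit nonce recommended for GCM
aesgcm = AESGCM(key)

with open("clip.wav", "rb") as f:    # illustrative file name
    audio = f.read()

ciphertext = aesgcm.encrypt(nonce, audio, None)  # auth tag is appended
recovered = aesgcm.decrypt(nonce, ciphertext, None)
assert recovered == audio            # decryption verifies authenticity
```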
ARTICLE | doi:10.20944/preprints202104.0766.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: PAM; Passive acoustic monitoring; audio classification; texture classification; PAM-filter; experimental protocols for audio classification; statistical tests
Online: 29 April 2021 (07:55:09 CEST)
Passive acoustic monitoring (PAM) is a non-invasive technique for monitoring wildlife. Acoustic surveillance is preferable in some situations, such as with marine mammals, which spend most of their time underwater, making it hard to obtain images of them. Machine learning is very useful for PAM, for example to identify species from audio recordings, but some care must be taken when evaluating the capability of a system. We define PAM-filters as the creation of experimental protocols according to the dates and locations of the recordings, aiming to avoid the use of the same individuals, noise, and recording devices in both the training and test sets. A random division of a database yields accuracies much higher than those obtained with protocols generated with a PAM-filter. Although we work with animal vocalizations, our method converts the audio into spectrogram images and then describes the images using texture features; these are well-known techniques for audio classification and have already been used for species classification. We also perform statistical tests to demonstrate the significant difference between accuracies generated with and without PAM-filters across several well-known classifiers. The configuration of our experimental protocols and the database have been made available online.
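A sketch of how a PAM-filter-style split might be implemented, assuming per-recording metadata with site and date columns; all column names are illustrative assumptions:

```python
# Split recordings by site and date so the same individuals, devices,
# and background noise never appear in both training and test sets.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

meta = pd.read_csv("recordings.csv")   # one row per recording (assumed)
# Group key combining location and recording session (date)
groups = meta["site"].astype(str) + "_" + meta["date"].astype(str)

splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(meta, meta["species"], groups))
# Unlike a random split, no group (site + date) crosses the partition,
# so accuracy estimates are not inflated by leaked recording conditions.
```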
ARTICLE | doi:10.20944/preprints202007.0209.v1
Subject: Engineering, Other Keywords: Deep learning; Head Related Transfer Function (HRTF); Restoration; Ambisonics; Spatial Audio; Spherical harmonic; Audio signal processing; Denoising; Auto-Encoder; Neural Network
Online: 10 July 2020 (08:58:11 CEST)
Spherical harmonic (SH) interpolation is a commonly used method to spatially up-sample sparse Head Related Transfer Function (HRTF) datasets to denser HRTF datasets. However, depending on the number of sparse HRTF measurements and the SH order, this process can introduce distortions in the high-frequency representation of the HRTFs. This paper investigates whether it is possible to restore some of the distorted high-frequency HRTF components using machine learning algorithms. A combination of Convolutional Auto-Encoder (CAE) and Denoising Auto-Encoder (DAE) models is proposed to restore the high-frequency distortion in SH-interpolated HRTFs. Results are evaluated using both Perceptual Spectral Difference (PSD) and localisation prediction models, both of which demonstrate significant improvement after the restoration process.
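An illustrative sketch of a convolutional auto-encoder for this kind of restoration, trained to map SH-interpolated (distorted) HRTF spectra to their measured references; the architecture and spectrum length are assumptions, not the paper's model:

```python
# Denoising-style conv auto-encoder: distorted spectra in, clean out.
import tensorflow as tf
from tensorflow.keras import layers

n_bins = 256                                  # frequency bins per HRTF
inp = layers.Input(shape=(n_bins, 1))
x = layers.Conv1D(32, 9, padding="same", activation="relu")(inp)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(64, 9, padding="same", activation="relu")(x)
x = layers.UpSampling1D(2)(x)
out = layers.Conv1D(1, 9, padding="same")(x)  # restored spectrum

cae = tf.keras.Model(inp, out)
cae.compile(optimizer="adam", loss="mse")
# cae.fit(distorted_hrtfs, reference_hrtfs, ...) pairs distorted inputs
# with clean targets, the same objective family as a denoising AE.
```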
ARTICLE | doi:10.20944/preprints201712.0001.v3
Subject: Mathematics & Computer Science, Probability And Statistics Keywords: signal processing; bayesian methods; subaquatic audio; hydrophone; unsupervised learning
Online: 8 January 2018 (18:29:11 CET)
The problem of event detection in general noisy signals arises in many applications; usually, either a functional form for the event is available, or a previously annotated sample with instances of the event can be used to train a classification algorithm. There are situations, however, where neither functional forms nor annotated samples are available; it is then necessary to apply other strategies to separate and characterize events. In this work, we analyze 15-minute-long samples of an acoustic signal and are interested in separating sections, or segments, of the signal that are likely to contain significant events. For that, we apply a sequential algorithm whose only assumption is that an event alters the energy of the signal. The algorithm is entirely based on Bayesian methods.
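A deliberately simplified sketch of the underlying assumption (an event alters the signal's energy): flag frames whose log-energy is improbable under a baseline noise estimate. The paper's sequential Bayesian algorithm is more elaborate; the frame length and threshold here are illustrative assumptions:

```python
# Energy-based candidate segmentation for unannotated noisy signals.
import numpy as np

def candidate_segments(signal, frame_len=4096, z=3.0):
    n = len(signal) // frame_len
    energy = np.array([
        np.sum(signal[i * frame_len:(i + 1) * frame_len] ** 2)
        for i in range(n)
    ])
    log_e = np.log(energy + 1e-12)
    mu, sigma = np.median(log_e), np.std(log_e)  # robust baseline estimate
    return np.where(log_e > mu + z * sigma)[0]   # frames likely holding events
```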
ARTICLE | doi:10.20944/preprints202203.0252.v1
Subject: Mathematics & Computer Science, Other Keywords: Audio-Visual Technologies; Blended Learning; Pedagogy; Virtual Learning Environments; Virtual Reality
Online: 17 March 2022 (11:05:27 CET)
The Covid-19 pandemic caused a shift in teaching practice towards blended learning for many Higher Education institutions. This led to the rapid adoption of certain digital technologies within existing teaching structures as a means to meet student access needs and facilitate learning. Integration of these technologies caused numerous challenges for practitioners and often produced mixed results. This paper attempts to summarise and extend pre-Covid pedagogical research on leveraging digital immersive technologies for blended teaching in the post-pandemic era. Focus is given to the evolution of Virtual Learning Environments through elements of immersive audio-visual technologies, which are shown to be effective when coupled in a blended approach. It is both a review of these methodologies and a case study of the I-Ulysses Virtual Learning Environment, used as a point of comparison for evaluating the review.
ARTICLE | doi:10.20944/preprints202010.0343.v2
Subject: Engineering, Other Keywords: deep learning; sound event detection; convolutional neural networks; audio processing; embedded systems
Online: 9 November 2020 (14:21:39 CET)
For the Remotely Piloted Aircraft Systems (RPAS) market to continue its current growth rate, cost-effective "Detect and Avoid" systems that enable safe beyond visual line of sight (BVLOS) operations are critical. We propose an audio-based "Detect and Avoid" system, composed of microphones and an embedded computer, which performs real-time inferences using a sound event detection (SED) deep learning model. Two state-of-the-art SED models, YAMNet and VGGish, are fine-tuned using our dataset of aircraft sounds and their performances are compared for a wide range of configurations. YAMNet, whose MobileNet architecture is designed for embedded applications, outperformed VGGish both in terms of aircraft detection and computational performance. YAMNet's optimal configuration, with > 70% true positive rate and precision, results from combining data augmentation and undersampling with the highest available inference frequency (i.e. 10 Hz). While our proposed "Detect and Avoid" system already allows the detection of small aircraft from sound in real time, additional testing using multiple aircraft types is required. Finally, a larger training dataset, sensor fusion, or remote computations on cloud-based services could further improve system performance.
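A sketch of the standard YAMNet transfer-learning setup this describes: load the model from TensorFlow Hub, pool its per-frame embeddings, and train a small binary head; the head architecture and pooling are assumptions, not the paper's exact configuration:

```python
# YAMNet embeddings + small classifier head for aircraft detection.
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

def clip_embedding(waveform_16khz):
    # YAMNet returns (scores, embeddings, spectrogram); average the
    # per-frame 1024-d embeddings into one clip-level vector.
    _, embeddings, _ = yamnet(waveform_16khz)
    return tf.reduce_mean(embeddings, axis=0)

head = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(1024,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(aircraft present)
])
head.compile(optimizer="adam", loss="binary_crossentropy")
# head.fit(embeddings_of_labelled_clips, labels, ...)
```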
ARTICLE | doi:10.20944/preprints201812.0086.v4
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: multi-modal information fusion; video skimming; audio and text classification; keyframe extraction
Online: 5 August 2019 (03:48:49 CEST)
In this paper, we propose a novel approach to video skimming that fuses video temporal information with keyword representations extracted from multi-modal video information, including audio, text, and visual indices. In addition, we introduce brand-safe filtering and sentiment analysis in order to retain only user-friendly content in the video skim. In experiments using videos from the YouTube-8M dataset, we show that the proposed approach conserves the semantic content of the video far better than approaches that use only partial information from the video.
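One hedged sketch of a single fusion step consistent with this description: score segments by keyword overlap with the video-level keywords and drop non-user-friendly segments; all names and the scoring rule are illustrative assumptions:

```python
# Keyword-overlap scoring plus sentiment-based brand-safe filtering.
from typing import List

def score_segment(segment_keywords: List[str],
                  video_keywords: List[str]) -> float:
    if not segment_keywords:
        return 0.0
    hits = len(set(segment_keywords) & set(video_keywords))
    return hits / len(segment_keywords)

def build_skim(segments, video_keywords, sentiment, budget=5):
    # Keep only user-friendly segments, then take the best-scoring ones.
    safe = [s for s in segments if sentiment(s["text"]) >= 0]
    ranked = sorted(safe, reverse=True,
                    key=lambda s: score_segment(s["keywords"],
                                                video_keywords))
    return ranked[:budget]
```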
ARTICLE | doi:10.20944/preprints201802.0076.v1
Subject: Engineering, Civil Engineering Keywords: non-destructive evaluation; hammering inspection; audio signal processing; machine learning; online learning
Online: 9 February 2018 (06:55:24 CET)
Developing efficient Artificial Intelligence (AI)-enabled systems to substitute for the human role in non-destructive testing is an emerging topic of considerable interest. In this study, we propose a novel impact-echo analysis system using online machine learning, which aims to achieve near-human performance in the assessment of concrete structures. Current computerized impact-echo systems commonly employ lab-scale data to validate the models. In practice, however, the echo patterns can be far more complicated due to the varying geometric shapes and materials of structures. To deal with a large variety of unseen data, we propose a sequential treatment for echo characterization. More specifically, the proposed system can adaptively update itself to approach human performance in impact-echo data interpretation. To this end, a two-stage framework is introduced, comprising echo feature extraction and a model updating scheme. Various state-of-the-art online learning algorithms are reviewed and evaluated for the task. For experimental validation, we collected 10,940 echo instances from multiple inspection sites, each annotated by human experts with healthy/defective condition labels. The results demonstrate that the proposed scheme achieves favorable echo pattern classification accuracy with high efficiency and low computational load.
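A minimal sketch of the sequential-update idea using scikit-learn's `partial_fit` interface; the model choice and feature extraction are assumptions, not the paper's algorithm:

```python
# Online classifier that refines itself as experts annotate new echoes.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="log_loss")   # logistic-regression-style model
classes = np.array([0, 1])             # healthy / defective

def on_new_annotation(echo_features, label):
    """Update the model with one expert-labelled echo, without
    retraining on the full archive."""
    clf.partial_fit(echo_features.reshape(1, -1), [label],
                    classes=classes)

# Predictions improve incrementally as inspections from new sites
# arrive, approaching the human annotators' decisions over time.
```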
ARTICLE | doi:10.20944/preprints202008.0707.v1
Subject: Behavioral Sciences, Cognitive & Experimental Psychology Keywords: Anxiety; Audio-Visual stimulation; COVID-19; Environmental enrichment; Forest environments; Forest therapy; Lockdown; Mental health; Stress; Quarantine
Online: 31 August 2020 (05:20:50 CEST)
The prolonged lockdown imposed to contain the COVID-19 pandemic prevented many people from direct contact with nature and greenspaces, raising alarms about a possible worsening of mental health. This study investigates the effectiveness of a simple and affordable remedy for improving psychological well-being, based on the audio-visual stimuli of a short computer video showing forest environments, with an urban video as a control. Randomly selected participants were assigned the forest or the urban video, to watch and listen to early in the morning, and completed questionnaires: in particular, the State-Trait Anxiety Inventory (STAI) Form Y, collected at baseline and at the end of the study, and Part II of the Sheehan Patient Rated Anxiety Scale (SPRAS), collected every day immediately before and after watching the video. The virtual exposure to forest environments proved effective in reducing perceived anxiety levels in people confined by lockdown to limited spaces and environmental deprivation. Although significant, the effects were observed only in the short term, highlighting a limitation of virtual experiences. The reported effects might also serve as a benchmark for disentangling the determinants of the health effects of real forest experiences, for example the inhalation of biogenic volatile organic compounds (BVOCs).