REVIEW | doi:10.20944/preprints202011.0657.v2
Subject: Medicine & Pharmacology, Allergology Keywords: hypnosis; multimodal monitoring; entropy; qNOX; qCON; bispectral index; surgical plethysmographic index; general anaesthesia; patient safety
Online: 25 January 2021 (17:02:57 CET)
The development of general anesthesia techniques and anesthetic substances brought new horizons for the expansion and improvement of surgical techniques. Nevertheless, more complex surgical procedures brought higher complexity and longer duration to general anesthesia, leading to a series of adverse events such as hemodynamic instability, under- or overdosing of anesthetic drugs, as well as an increased number of post-anesthetic events. To adapt the anesthesia to the particularities of each patient, multimodal monitoring of these patients is highly recommended. Classically, general anesthesia monitoring consists of the analysis of vital functions and gas exchange. Multimodal monitoring refers to the concomitant monitoring of the degree of hypnosis and of the nociceptive-antinociceptive balance. By titrating anesthetic drugs according to these parameters, clinical benefits can be obtained, such as hemodynamic stabilization, reduction of awakening times, and reduction of post-operative complications. Another important aspect is the impact on inflammatory status and redox balance. By minimizing the inflammatory and oxidative impact, one can achieve a faster recovery, leading to both increased patient satisfaction and increased patient safety. The purpose of this literature review is to present the most modern multimodal monitoring techniques and to discuss the particularities of each technique.
ARTICLE | doi:10.20944/preprints202208.0353.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: recommender; multimodal; context-aware
Online: 19 August 2022 (03:07:04 CEST)
The advent of the era of big data brings more convenience to people and greater development to society. At the same time, it brings the problem of 'information overload': faced with huge volumes of data, people encounter much redundant and worthless information that seriously interferes with the accurate selection of relevant data. Even though Internet search engines give access to this information, they cannot meet the personalized needs of a particular user in a particular context. Therefore, how to find useful and valuable information quickly has become one of the key issues in the development of big data. Recommendation systems, as an important technology for alleviating information overload, have been widely used in the field of e-commerce. Recommender systems suffer from a key problem: data sparsity. The sparsity of user history rating data causes insufficient training of collaborative filtering recommendation models, which leads to a significant decrease in the accuracy of recommendations. Moreover, traditional recommendation systems tend to focus on rating information and ignore the context in which users interact, even though the various contextual modalities present in real life also play an important role in the recommendation process. In this paper we achieve data degradation and feature extraction, solving the problem of sparse data in the recommendation process. An interaction context-aware sub-model is constructed based on a tensor decomposition model with interaction context information to model the specific influence of interaction context in the recommendation process.
Then an attribute context-aware sub-model is constructed based on the matrix decomposition model, using attribute context information to model the influence of user attribute contexts and item attribute contexts on recommendations. In building the model, the method utilizes not only the explicit feedback rating information of users in the original dataset, but also the interaction context and attribute context information of the implicit feedback and the unlabeled rating data. We evaluate our model by extensive experiments, and the results demonstrate the effectiveness of our recommender model.
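As an illustrative sketch (not the authors' context-aware model), the matrix-decomposition core that the sub-models above extend can be written as plain latent-factor training with stochastic gradient descent. All data, dimensions, and hyperparameters below are made-up toy values.

```python
import random

# Minimal matrix-factorization recommender: learn user factors P and item
# factors Q so that P[u] . Q[i] approximates the observed rating r(u, i).
# The context-aware sub-models described above add extra context factors;
# this sketch shows only the plain factorization step.

random.seed(0)
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}  # (user, item) -> rating
n_users, n_items, k = 3, 2, 2
P = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
Q = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]

def predict(u, i):
    return sum(P[u][f] * Q[i][f] for f in range(k))

lr, reg = 0.05, 0.02
for _ in range(500):                      # SGD over the observed ratings
    for (u, i), r in ratings.items():
        err = r - predict(u, i)
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            P[u][f] += lr * (err * qi - reg * pu)
            Q[i][f] += lr * (err * pu - reg * qi)

rmse = (sum((r - predict(u, i)) ** 2 for (u, i), r in ratings.items())
        / len(ratings)) ** 0.5
print(round(rmse, 3))
```

A context-aware extension would add further factor tables indexed by the interaction or attribute context, which is what the tensor-decomposition sub-model generalizes.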
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Indoor Localization; Sensor Fusion; Multimodal Deep Neural Network; Multimodal Sensing; WiFi Fingerprinting; Pedestrian Dead Reckoning
Online: 13 October 2021 (12:14:39 CEST)
Many engineered approaches have been proposed over the years for solving the hard problem of indoor localisation using smartphone sensors. However, specialising these solutions for difficult edge cases remains challenging. Here we propose an end-to-end hybrid multimodal deep neural network localisation system, MM-Loc, that relies on zero hand-engineered features, learning them automatically from data instead. This is achieved by using modality-specific neural networks to extract preliminary features from each sensing modality, which are then combined by cross-modality neural structures. We show that each of our modality-specific neural architectures can estimate the location with good accuracy independently, but that a multimodal neural network fusing the early modality-specific representations achieves better accuracy. Our proposed MM-Loc solution is tested on cross-modality samples characterised by different sampling rates and data representations (inertial sensors, magnetic and WiFi signals), outperforming traditional approaches for location estimation. Unlike conventional indoor positioning systems, which rely on human intuition, MM-Loc trains directly from data.
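The feature-level ("early") fusion described above can be sketched in a few lines: each modality gets its own encoder, and the fused representation is the concatenation of the per-modality embeddings, which a shared head would then map to a location. The encoder functions below are hypothetical stand-ins for the modality-specific neural networks, with toy data.

```python
# Hypothetical stand-ins for modality-specific encoders; a real system
# would use learned neural networks here.

def encode_inertial(samples):          # e.g. one accelerometer window
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return [mean, var]

def encode_wifi(rssi):                 # e.g. one WiFi RSSI fingerprint
    return [max(rssi), sum(rssi) / len(rssi)]

def fuse(*embeddings):                 # early fusion = concatenation
    fused = []
    for e in embeddings:
        fused.extend(e)
    return fused

inertial = encode_inertial([0.1, 0.3, -0.2, 0.4])
wifi = encode_wifi([-60, -72, -55])
features = fuse(inertial, wifi)
print(len(features))   # the fused vector would feed a shared location head
```

The design point is that fusion happens on intermediate representations rather than raw signals, which lets each modality keep its own sampling rate and data format upstream.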
ARTICLE | doi:10.20944/preprints202206.0412.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Dictionary learning, Recommender system, Personalized recommendation, Multimodal
Online: 30 June 2022 (03:43:30 CEST)
In today’s Web 2.0 era, online social media has become an integral part of our lives. In the course of the information revolution, the form of information has undergone a radical change, from simple text to today’s integrated video, image, text and audio, and the ways of dissemination and access have changed greatly as well: people no longer rely only on traditional media to passively receive information, but actively and selectively obtain information from social media. It has therefore become a great challenge to effectively utilize this massive, integrated multi-modal media information in a system for retrieval, browsing, analysis and usage. Unlike movies and traditional long-form video content, micro-videos are usually short, between a few seconds and tens of seconds, which allows users to quickly browse different contents and make full use of the fragmented time in their lives; users can also share their micro-videos with friends or the public, forming a unique social medium. Video contains rich multimodal information, and fusing information from multiple modalities can improve the accuracy of video recommendation. For the micro-video recommendation task, a new combinatorial network model is proposed that combines the discrete features of each modality into overall per-modality features through the network, and then fuses the modality features to obtain overall video features used for recommendation. To verify the effectiveness of the proposed algorithm, experiments are conducted on a public dataset, demonstrating the effectiveness of our model.
ARTICLE | doi:10.20944/preprints201808.0312.v1
Subject: Engineering, Other Keywords: affordance; empathy; HRI; emotion; multimodal; allocentric; libraries
Online: 17 August 2018 (13:45:09 CEST)
Affordances are an important concept in cognition, which can be applied to robots in order to achieve successful human-robot interaction (HRI). In this paper we explore and discuss the idea of emotional affordances and propose a viable model for implementation into HRI. We consider “two-way” affordances: a perceived object triggering an emotion, and a perceived human emotional expression triggering an action. To keep the implementation generic, the proposed model includes a library that can be customised for the specific robot and application scenario. We present the AAA (Affordance-Appraisal-Arousal) model, which incorporates Plutchik’s Wheel of Emotions, and show some examples of simulation and possible scenarios.
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Multimodal Machine Learning; Deep Learning; Hate Speech Detection
Online: 15 March 2021 (13:46:27 CET)
Hateful and abusive speech presents a major challenge for all online social media platforms. Recent advances in Natural Language Processing and Natural Language Understanding allow more accurate detection of hate speech in textual streams. This study presents a multimodal approach to hate speech detection by combining Computer Vision and Natural Language Processing models for abusive context detection. Our study focuses on Twitter messages and, more specifically, on hateful, xenophobic and racist speech in Greek aimed at refugees and migrants. In our approach we combine transfer learning and fine-tuning of Bidirectional Encoder Representations from Transformers (BERT) and Residual Neural Networks (ResNet). Our contribution includes the development of a new dataset for hate speech classification, consisting of tweet ids, along with the code to obtain their visual appearance, as they would have been rendered in a web browser. We have also released a pre-trained Language Model trained on Greek tweets, which has been used in our experiments. We report a consistently high level of accuracy (accuracy score=0.970, f1-score=0.947 in our best model) in racist and xenophobic speech detection.
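The two metrics quoted above are standard; as a small illustrative sketch (with toy labels, not the paper's dataset), accuracy and F1 for a binary hateful/not-hateful classifier are computed from the confusion-matrix counts like this:

```python
# Accuracy and F1 for binary classification (1 = hateful, 0 = not).
# y_true / y_pred below are made-up toy values for illustration.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0]
print(accuracy(y_true, y_pred), round(f1(y_true, y_pred), 3))
```

F1 is reported alongside accuracy because hate-speech datasets are typically imbalanced, and accuracy alone can look high for a classifier that under-predicts the minority (hateful) class.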
REVIEW | doi:10.20944/preprints202102.0349.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Learning (artificial intelligence); Neural networks; Activity recognition; Multimodal sensors
Online: 17 February 2021 (09:26:44 CET)
The growing use of sensor tools and the Internet of Things requires applications to understand sensor data. In realistic situations, however, there are major difficulties that can impact the efficiency of a recognition system. Recently, as the utility of deep learning in many fields has been shown, various deep approaches have been researched to tackle the challenges of detection and recognition. In this review we present a sample of specialized deep learning approaches for the identification of sensor-based human behaviour. Next, we present the multi-modal sensory data and include information on the public databases that can be used for different challenge tasks. A new taxonomy is then suggested to organize deep approaches according to challenges. Open problems and the approaches connected to them are summarized and evaluated to provide an analysis of the ongoing advancement of the field. We conclude by addressing open issues and offering perspectives on future directions.
ARTICLE | doi:10.20944/preprints202201.0059.v1
Subject: Earth Sciences, Geochemistry & Petrology Keywords: microporous carbonates; multimodal porosity; primary drainage; capillary invasion; mixed wettability
Online: 6 January 2022 (10:03:11 CET)
Improved oil recovery from tight carbonate formations may provide the world with a major source of energy over several decades. Here we provide an overview of the Arab D formation in the largest oil field on Earth, the Ghawar. We investigate the occurrence of microporosity of different origins and sizes using scanning electron microscopy (SEM) and pore casting techniques. Then, we present a robust calculation of the probability of invasion and oil saturation distribution in the nested micropores using mercury injection capillary pressure data available in the literature. We show that large portions of the micropores in the Arab D formation would have been bypassed during primary drainage unless the invading crude oil ganglia were sufficiently long. Considering the asphaltenic nature of oil in the Ghawar, we expect the invaded portions of the pores to turn mixed-wet, thus becoming inaccessible to waterflooding until further measures are taken to modify the system’s chemistry.
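The interpretation of mercury injection capillary pressure (MICP) data mentioned above rests on the Washburn (Young-Laplace) relation Pc = 2γ·cos(θ)/r, which maps each intrusion pressure to the pore-throat radius invaded at that pressure. A minimal sketch, using standard textbook values for mercury and a made-up example pressure:

```python
import math

# Washburn relation for MICP: Pc = 2 * gamma * cos(theta) / r.
# gamma and theta are standard textbook values for mercury intrusion;
# the capillary pressure below is an illustrative example, not field data.

gamma = 0.485                  # N/m, surface tension of mercury
theta = math.radians(140)      # mercury contact angle on the rock surface

def throat_radius(pc_pa):
    """Pore-throat radius (m) invaded at capillary pressure pc_pa (Pa)."""
    return abs(2 * gamma * math.cos(theta)) / pc_pa

pc = 1.0e6                     # 1 MPa entry pressure (illustrative)
r = throat_radius(pc)
print(round(r * 1e6, 3), "micrometres")
```

Sweeping this conversion over a full MICP curve yields the pore-throat size distribution from which invasion probabilities for the nested micropores can be estimated.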
REVIEW | doi:10.20944/preprints202208.0141.v1
Subject: Medicine & Pharmacology, Anesthesiology Keywords: hemorrhagic shock; multimodal monitoring; individualized therapy; fluid therapy; critical care; trauma
Online: 8 August 2022 (09:56:33 CEST)
Worldwide, one of the main causes of death among young adults is multiple trauma. In these patients hemorrhagic shock represents the leading cause of worsening clinical status and of increased morbidity and mortality. This is due to a multifactorial complex involving cellular, biological, and biophysical mechanisms. The most important mechanisms affecting clinical outcome are oxidative stress, the augmentation of pro-inflammatory status, immune deficiency, disruptions in the coagulation cascade, and imbalances in electrolyte and acid-base homeostasis. Polytrauma patients in hemorrhagic shock need adequate fluid management to ensure hemodynamic stability, which must consider not only the maintenance of adequate blood pressure but also the adequate oxygenation of tissues for optimal cellular function. In current clinical practice, fluid resuscitation in polytrauma patients uses a variety of widely studied pharmacological products, such as crystalloids, colloids, blood transfusions, and the infusion of other blood products. Although these products exist, no agreement has been reached on a standard administration protocol that could be applied generally to all patients. Moreover, numerous studies have reported a series of adverse events related to fluid resuscitation and to the inadequate use of these products. This review aims at describing the impact that the administration of all the solutions used in fluid resuscitation might have on the cellular and pathophysiological mechanisms in polytrauma patients suffering from hemorrhagic shock.
ARTICLE | doi:10.20944/preprints202012.0092.v1
Subject: Keywords: anomaly detection; machine learning; large standoff magnetometry; multimodal data; RAPIDS-AI
Online: 4 December 2020 (07:13:40 CET)
Pipeline integrity is an important area of concern for the oil and gas, refining, chemical, hydrogen, carbon sequestration, and electric-power industries, due to the safety risks associated with pipeline failures. Regular monitoring, inspection, and maintenance of these facilities is therefore required for safe operation. Large standoff magnetometry (LSM) is a non-intrusive, passive magnetometer-based measurement technology that has shown promise in detecting defects (anomalies) in regions of elevated mechanical stresses. However, analyzing the noisy multi-sensor LSM data to clearly identify regions of anomalies is a significant challenge, mainly due to the high frequency of the data collection, misalignment between consecutive inspections and sensors, and the number of sensor measurements recorded. In this paper we present an LSM defect identification approach based on machine learning (ML). We show that this ML approach is able to successfully detect anomalous readings using a series of methods with increasing model complexity and capacity, starting from unsupervised learning with "point" methods and eventually increasing to supervised learning with sequence methods and multi-output predictions. We observe data leakage issues for some methods with randomized train/test splitting and resolve them by specific non-randomized splitting of training and validation data. We also achieve a 200x acceleration of the support-vector classifier (SVC) method by porting computations from CPU to GPU, leveraging the cuML RAPIDS AI library. For sequence methods, we develop a customized Convolutional Neural Network (CNN) architecture based on 1D convolutional filters to identify and characterize multiple properties of these defects. Finally, we report the scalability of the best-performing methods and compare their viability for field trials.
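The leakage issue mentioned above is easy to reproduce: sliding windows over a sensor stream overlap, so a random train/test split puts near-duplicate windows on both sides. A minimal sketch (window and stride sizes are illustrative, not the paper's settings) of a contiguous, non-randomized split that also drops boundary windows overlapping the training set:

```python
# Sliding windows over a 1D sensor trace, then a contiguous split:
# no shuffling across the cut, and validation windows that overlap the
# last training window in time are dropped.

def sliding_windows(signal, size, stride):
    return [signal[i:i + size]
            for i in range(0, len(signal) - size + 1, stride)]

def contiguous_split(windows, size, stride, train_frac=0.8):
    cut = int(len(windows) * train_frac)
    # Number of boundary windows that still overlap the training data.
    gap = max(0, -(-(size - stride) // stride))   # ceil division
    return windows[:cut], windows[cut + gap:]

signal = list(range(100))            # stand-in for one LSM sensor trace
windows = sliding_windows(signal, size=10, stride=5)
train, val = contiguous_split(windows, size=10, stride=5)

# Every validation window now starts after the last training window ends,
# so no overlapping samples span the split.
print(len(train), len(val))
```

A randomized split of the same windows would interleave train and validation windows that share up to half their samples, which inflates validation scores without improving the model.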
ARTICLE | doi:10.20944/preprints201811.0135.v1
Subject: Life Sciences, Biotechnology Keywords: iron oxide nanoparticles; multimodal nanoparticles; biodistribution; magnetic resonance imaging; aging; coating degradation
Online: 6 November 2018 (10:37:57 CET)
Medical imaging is an active field of research that fosters the need for novel multimodal imaging probes. In this line, nanoparticle-based contrast agents are of special interest, since they can host functional entities both within their interior, reducing potential toxic effects of the imaging tracers, and on their surface, providing high payloads of probes thanks to their large surface-to-volume ratio. The long-term stability of the particles in solution is an aspect usually neglected during probe design in research laboratories, since their performance is generally tested shortly after synthesis. This may jeopardize a later translation into practical medical devices for stability reasons. To dig into the effects of nanoparticle aging in solution on their behavior in vivo, iron oxide stealth nanoparticles were used at two stages (3 weeks vs. 9 months in solution), analyzing their biodistribution in mice. Both sets of nanoprobes showed similar sizes, zeta potentials and morphology, as observed by DLS and TEM, but fresh nanoparticles accumulated in the kidneys after systemic administration, while aged ones accumulated in the liver and spleen, confirming an enormous effect of particle aging on in vivo behavior, despite barely noticeable changes on a simple inspection of their structural integrity.
ARTICLE | doi:10.20944/preprints202208.0447.v1
Subject: Medicine & Pharmacology, Anesthesiology Keywords: low-back pain (LBP); guidelines; gaps; evidence-based; acute pain; analgesics; multimodal analgesia; fixed doses combination (FDC)
Online: 26 August 2022 (04:36:13 CEST)
Acute low back pain (LBP) stands as a leading cause of activity limitation and work absenteeism, and its associated healthcare expenditures are expected to become substantial when acute LBP develops into a chronic and even refractory condition. Therefore, early intervention is crucial to prevent progression to chronic pain, whose management is particularly challenging and for which the most effective pharmacological therapy is still controversial. Current guideline treatment recommendations vary and are mostly driven by expertise, with opinions differing across interventions. It is thus difficult to formulate evidence-based guidance when relatively few randomized clinical trials have explored the diagnosis and management of LBP, and those that have employed different selection criteria, statistical analyses, and outcome measurements. This narrative review aims to provide a critical appraisal of current acute LBP management by discussing the unmet needs and areas of improvement from bench to bedside, and proposes multimodal analgesia as the way forward to attain effective and prolonged pain relief and functional recovery in patients with acute LBP.
ARTICLE | doi:10.20944/preprints202201.0061.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: BERT, Document Image Classification, EfficientNet, fine-tuned BERT, Hierarchical Attention Networks, Multimodal, RVL-CDIP, Two-stream, Tobacco-3482
Online: 6 January 2022 (10:08:38 CET)
Document classification is one of the most critical steps in the document analysis pipeline. There are two types of approaches for document classification: image-based and multimodal. Image-based document classification approaches rely solely on the inherent visual cues of the document images, whereas the multimodal approach co-learns visual and textual features and has proved to be more effective. Nonetheless, these approaches require a huge amount of data. This paper presents a novel approach for document classification that works with a small amount of data and outperforms other approaches. The proposed approach incorporates a hierarchical attention network (HAN) for the textual stream and EfficientNet-B0 for the image stream. The hierarchical attention network in the textual stream uses dynamic word embeddings through fine-tuned BERT and incorporates both word-level and sentence-level features. While earlier approaches rely on training on a large corpus (RVL-CDIP), we show that our approach works with a small amount of data (Tobacco-3482). To this end, we trained the neural network on Tobacco-3482 from scratch. Thereby, we outperform the state-of-the-art by obtaining an accuracy of 90.3%, which corresponds to a relative error reduction rate of 7.9%.
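The "relative error reduction" figure quoted above relates the two error rates as (err_baseline - err_ours) / err_baseline. As an illustrative sketch, the implied baseline accuracy can be back-solved from the quoted 90.3% accuracy and 7.9% reduction; the baseline value below is derived from those figures, not stated in the abstract.

```python
# Relative error reduction between a baseline and a new classifier:
#   rel = (err_baseline - err_new) / err_baseline
# The baseline accuracy here is back-solved from the quoted numbers
# (90.3% accuracy, 7.9% relative error reduction) for illustration only.

def relative_error_reduction(acc_baseline, acc_new):
    err_base = 1.0 - acc_baseline
    err_new = 1.0 - acc_new
    return (err_base - err_new) / err_base

acc_ours = 0.903
acc_baseline = 1.0 - (1.0 - acc_ours) / (1.0 - 0.079)   # about 0.895
print(round(relative_error_reduction(acc_baseline, acc_ours), 3))
```

Relative error reduction is often more informative than the raw accuracy gap when accuracies are already high, since a move from 89.5% to 90.3% shrinks the remaining errors by a noticeably larger fraction.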
Subject: Behavioral Sciences, Applied Psychology Keywords: multimodal experiment; multisensory experiment; automatic device integration; open-source; PsychoPy; Unity; Virtual Reality (VR); Lab Streaming Layer; LabRecorder; LabRecorderCLI; Windows command line (cmd.exe)
Online: 12 October 2020 (07:06:28 CEST)
The human mind is multimodal, yet most behavioral studies rely on century-old measures of behavior: task accuracy and latency (response time). Multimodal and multisensory analysis of human behavior creates a better understanding of how the mind works. The problem is that designing and implementing these experiments is technically complex and costly. This paper introduces versatile and economical means of developing multimodal-multisensory human experiments. We provide an experimental design framework that automatically integrates and synchronizes measures including electroencephalogram (EEG), galvanic skin response (GSR), eye-tracking, virtual reality (VR), body movement, mouse/cursor motion and response time. Unlike proprietary systems (e.g., iMotions), our system is free and open-source; it integrates PsychoPy, Unity and Lab Streaming Layer (LSL). The system embeds LSL inside PsychoPy/Unity to synchronize multiple sensory signals (gaze motion, EEG, GSR, mouse/cursor movement, and body motion) with low-cost consumer-grade devices, in a simple behavioral task designed in PsychoPy and a virtual reality environment designed in Unity. This tutorial shows a step-by-step process by which a complex multimodal-multisensory experiment can be designed and implemented in a few hours. When the experiment is conducted, all data synchronization and recording of the data to disk is done automatically.
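What LSL-style synchronization buys in analysis is that every stream's samples carry a shared clock, so events in one stream can be matched to the nearest sample of another. A minimal pure-Python sketch of that nearest-timestamp alignment (this is a stand-in for post-hoc analysis, not the pylsl API; all data are toy values):

```python
import bisect

# Two streams recorded on a shared clock, as LSL provides after
# synchronization: a fast stream (e.g. EEG) and a slow stream (e.g. GSR).
# Each sample is a (timestamp, value) pair; values below are made up.

eeg = [(0.00, 1.2), (0.01, 1.4), (0.02, 1.1), (0.03, 1.6)]   # ~100 Hz
gsr = [(0.000, 4.0), (0.024, 4.3)]                            # slower stream

def nearest(stream, t):
    """Return the sample in `stream` whose timestamp is closest to t."""
    times = [ts for ts, _ in stream]
    i = bisect.bisect_left(times, t)
    candidates = stream[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s[0] - t))

# Align each GSR sample with the closest EEG sample on the shared clock.
aligned = [(g, nearest(eeg, g[0])) for g in gsr]
print(aligned)
```

Without a shared clock, each device timestamps on its own drifting clock and this kind of alignment is unreliable, which is exactly the problem LSL's synchronization layer addresses.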