ARTICLE | doi:10.20944/preprints202209.0196.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: Autonomous Vehicles; Reinforcement Learning; Explainable Reinforcement Learning; XRL
Online: 14 September 2022 (08:13:44 CEST)
While machine learning models are powering more and more everyday devices, there is a growing need to explain them. This especially applies to the use of deep reinforcement learning in safety-critical solutions such as vehicle motion planning. In this paper, we propose a method for understanding what an RL agent's decisions are based on. The method relies on statistical analysis of a massive set of state–decision samples. It indicates which input features have an impact on the agent's decisions, and reveals the relationships between decisions, the significance of the input features, and their values. The method allows us to determine where the agent's decision-making process agrees with human intuition and where it contradicts it. We applied the proposed method to an RL motion-planning agent designed to drive a vehicle safely and efficiently on a highway. We find that such analysis allows for a better understanding of the agent's decisions, inspecting its behavior, debugging the ANN model, and verifying the correctness of the input values, which increases the agent's credibility.
ARTICLE | doi:10.20944/preprints202112.0337.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: Transfer learning; Reinforcement learning; Adaptive operator selection; Artificial bee colony
Online: 21 December 2021 (13:41:06 CET)
In the past two decades, metaheuristic optimization algorithms (MOAs) have become increasingly popular, particularly for logistics, science, and engineering problems. A fundamental characteristic of such algorithms is that they depend on parameters or strategies, and online and offline strategies are employed to obtain optimal configurations. Adaptive operator selection is one of them: it determines whether or not to switch strategies from the strategy pool during the search process. In the field of machine learning, reinforcement learning (RL) refers to goal-oriented algorithms that learn from the environment how to achieve a goal. In MOAs, reinforcement learning has been utilised to control the operator selection process. Existing research, however, fails to show that learned information may be transferred from one problem-solving procedure to another. The primary goal of the proposed research is to determine the impact of transfer learning on RL and MOAs. As a test problem, the set union knapsack problem with 30 separate benchmark instances is used, and the results are statistically compared in depth. According to the findings, the learning process improved the convergence speed while significantly reducing the CPU time.
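The RL-controlled operator selection described above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's code): an epsilon-greedy Q-learning controller picks which search operator the metaheuristic applies next, rewarding operators that improve the incumbent solution; "transfer learning" then amounts to reusing the learned Q-values on the next problem instance.

```python
import random

class OperatorSelector:
    """Epsilon-greedy RL controller for adaptive operator selection (sketch)."""

    def __init__(self, n_operators, alpha=0.1, epsilon=0.2):
        self.q = [0.0] * n_operators   # one value estimate per operator
        self.alpha = alpha             # learning rate
        self.epsilon = epsilon         # exploration probability

    def select(self):
        # Explore occasionally, otherwise pick the best-valued operator.
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda i: self.q[i])

    def update(self, op, reward):
        # Incremental value update; carrying self.q to a new problem
        # instance is the transfer-learning step studied in the paper.
        self.q[op] += self.alpha * (reward - self.q[op])

sel = OperatorSelector(n_operators=3)
op = sel.select()
sel.update(op, reward=1.0)   # e.g. operator improved the best solution
```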
ARTICLE | doi:10.20944/preprints202203.0094.v1
Subject: Engineering, Automotive Engineering
Keywords: Smart scheduling; Smart Reservations; Reinforcement Learning; Electric vehicle charging; Electric Vehicle Charging Management platform; DQN Reinforcement Learning algorithm
Online: 7 March 2022 (09:20:13 CET)
As the policies and regulations currently in place concentrate on environmental protection and greenhouse gas reduction, we are steadily witnessing a shift in the transportation industry towards electromobility. Several issues, however, need to be addressed to encourage the adoption of EVs at a larger scale. To this end, we propose a solution capable of addressing multiple EV charging scheduling issues, such as congestion management, reserving a charging station in advance, and allowing EV drivers to plan optimized long trips using their EVs. The smart charging scheduling system we propose considers a variety of factors, such as battery charge level, trip distance, nearby charging stations, other appointments, and average speed. Given the scarcity of datasets required to train reinforcement learning algorithms, the novelty of the recommended solution lies in its scenario simulator, which generates the labelled datasets needed to train the algorithm. Based on the generated scenarios, we created and trained a neural network that uses a history of previous situations to identify the optimal charging station and time interval for recharging. The results are promising, and for future work we plan to train the DQN model using real-world data.
ARTICLE | doi:10.20944/preprints202005.0181.v1
Subject: Mathematics & Computer Science, Numerical Analysis & Optimization
Keywords: Reinforcement learning; Cartpole; Q Learning; Mathematical Modeling
Online: 10 May 2020 (18:02:43 CEST)
The prevalence of differential equations as a mathematical technique has refined the fields of control theory and constrained optimization through the ability to accurately model chaotic, unbalanced systems. However, the systems studied in recent research are increasingly nonlinear and difficult to model with differential equations alone. A newer approach is to use policy iteration and reinforcement learning, techniques that center on an action-and-reward sequence for a controller. Reinforcement learning (RL) can be applied to control theory problems because it is robust in dynamic environments such as the cartpole system (an inverted pendulum). This solution avoids the use of PID controllers and other dynamics-optimization schemes in favor of a more robust, reward-based control mechanism. This paper applies RL and Q-learning to the classic cartpole problem, while also discussing the mathematical background and the differential equations used to model the system.
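The core of the tabular Q-learning approach mentioned above can be sketched in a few lines. This is an illustrative example (not the paper's code): the continuous cartpole state is discretized into coarse bins, and the classic update Q(s,a) ← Q(s,a) + α(r + γ·max_a' Q(s',a') − Q(s,a)) is applied; bin ranges and hyperparameters here are assumptions.

```python
from collections import defaultdict

# Hyperparameters (illustrative values).
ALPHA, GAMMA, ACTIONS = 0.1, 0.99, (0, 1)   # actions: push left / push right
Q = defaultdict(float)                        # Q-table, default value 0

def discretize(x, x_dot, theta, theta_dot, bins=6):
    """Map the continuous cartpole state to a tuple of integer bins."""
    clip = lambda v, lo, hi: max(lo, min(hi, v))
    def bucket(v, lo, hi):
        return int(clip((v - lo) / (hi - lo), 0.0, 0.999) * bins)
    return (bucket(x, -2.4, 2.4), bucket(x_dot, -3.0, 3.0),
            bucket(theta, -0.21, 0.21), bucket(theta_dot, -3.0, 3.0))

def q_update(s, a, r, s_next):
    """One temporal-difference backup of the Q-table."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# One illustrative transition: reward 1 for every step the pole stays up.
s = discretize(0.0, 0.0, 0.05, 0.0)
s_next = discretize(0.0, 0.1, 0.04, -0.1)
q_update(s, 0, 1.0, s_next)
```

In a full training loop, `discretize` and `q_update` would be called on every step of each episode, with an epsilon-greedy rule choosing the action from `Q`.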
ARTICLE | doi:10.20944/preprints202007.0598.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: Reinforcement Learning; Simulation; Health Services Research; Operational Research
Online: 24 July 2020 (14:45:33 CEST)
Background and motivation: Combining Deep Reinforcement Learning (Deep RL) and Health Systems Simulations has significant potential, for both research into improving Deep RL performance and safety, and in operational practice. While individual toolkits exist for Deep RL and Health Systems Simulations, no framework to integrate the two has been established. Aim: Provide a framework for integrating Deep RL Networks with Health System Simulations, and to ensure this framework is compatible with Deep RL agents that have been developed and tested using OpenAI Gym. Methods: We developed our framework based on the OpenAI Gym framework, and demonstrate its use on a simple hospital bed capacity model. We built the Deep RL agents using PyTorch, and the Hospital Simulation using SimPy. Results: We demonstrate example models using a Double Deep Q Network or a Duelling Double Deep Q Network as the Deep RL agent. Conclusion: SimPy may be used to create Health System Simulations that are compatible with agents developed and tested on OpenAI Gym environments. GitHub repository of code: https://github.com/MichaelAllen1966/learninghospital
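The integration the paper describes hinges on exposing the simulation through the standard Gym interface (`reset`/`step`). The toy class below sketches that shape with a stateless stand-in for the SimPy bed-capacity model; all names and dynamics here are illustrative assumptions, not taken from the linked repository.

```python
import random

class HospitalBedEnv:
    """Toy hospital bed-capacity environment with a Gym-like API (sketch)."""

    def __init__(self, max_beds=100):
        self.max_beds = max_beds

    def reset(self):
        self.beds = 50          # staffed beds
        self.patients = 40      # current occupancy
        return self._obs()

    def step(self, action):
        # action: change in staffed beds (e.g. -5, 0, +5).
        self.beds = max(0, min(self.max_beds, self.beds + action))
        self.patients = max(0, self.patients + random.randint(-5, 5))
        # Penalize both unmet demand and idle beds.
        reward = -abs(self.beds - self.patients)
        done = False            # a SimPy run would set this at simulation end
        return self._obs(), reward, done, {}

    def _obs(self):
        return (self.beds, self.patients)

env = HospitalBedEnv()
obs = env.reset()
obs, reward, done, info = env.step(+5)
```

Because any Deep RL agent written against this interface only sees `reset`/`step`, swapping the toy dynamics for a full SimPy process model leaves the agent code unchanged, which is the point of the framework.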
REVIEW | doi:10.20944/preprints201811.0510.v2
Subject: Engineering, Control & Systems Engineering
Keywords: deep reinforcement learning; imitation learning; soft robotics
Online: 23 November 2018 (11:57:55 CET)
The growing interest in the innate softness of robotic structures, combined with the extensive developments in embodied intelligence, has given rise to a relatively new yet extremely rewarding sphere of technology. Fusing current deep reinforcement learning algorithms with the physical advantages of soft bio-inspired structures points towards fully self-sufficient agents that learn, from observations collected in their environment, to achieve an assigned task. For soft robotic structures possessing countless degrees of freedom, it is often difficult (sometimes impossible) to formulate the mathematical constraints needed to train a deep reinforcement learning (DRL) agent for the task at hand; hence we resort to imitation learning, since tasks such as manipulation can easily be performed manually and comfortably mimicked by the agent. Deploying current imitation learning algorithms on soft robotic systems has been observed to provide satisfactory results, but challenges remain. This review article therefore gives an overview of such algorithms, along with instances of their application to real-world scenarios yielding state-of-the-art results, followed by brief descriptions of various pristine branches of DRL research that may become centers of future work in this field.
REVIEW | doi:10.20944/preprints202007.0693.v1
Online: 29 July 2020 (11:12:32 CEST)
Over the last decade, the amount of data exchanged over the Internet increased by a staggering factor of more than 100, and was expected to exceed 500 exabytes by 2020. This phenomenon is mainly due to the evolution of high-speed broadband Internet and, more specifically, the popularization and widespread use of smartphones and their affordable data plans. Although 4G, with its Long-Term Evolution (LTE) technology, is seen as a mature technology, its radio technology and architecture continue to improve, for example within the LTE Advanced standard, a major enhancement of LTE. For the long run, however, the next generation of telecommunications (5G) is being developed and is gaining considerable momentum from both industry and researchers. In addition, with the deployment of Internet of Things (IoT) applications, smart cities, vehicular networks, e-health systems, and Industry 4.0, a new plethora of 5G services has emerged with widely diverging and technologically challenging design requirements. These include high mobile data volume per area, a high number of connected devices per area, high data rates, longer battery life for low-power devices, and reduced end-to-end latency. Several technologies are being developed to meet these new requirements, among them ultra-densification, millimeter-wave usage, massive multiple-input multiple-output (MIMO) antennas, antenna beamforming to increase spatial diversity, and edge/fog computing. Each of these technologies brings its own design issues and challenges. For instance, ultra-densification and MIMO increase the complexity of estimating channel conditions, and traditional channel state information (CSI) estimation techniques are no longer suitable for these new scenarios. As a result, new approaches to evaluating network conditions, such as continuously collecting and monitoring key performance indicators, become necessary.
Timely decisions are needed to ensure the correct operation of such networks. In this context, deep learning (DL) models can be seen as one of the main tools for processing monitoring data and automating decisions. Since these models are able to extract relevant features from raw data (images, texts, and other types of unstructured data), the integration of 5G and DL looks promising and deserves exploration. As its main contribution, this paper presents a systematic review of how DL is being applied to solve 5G issues. We examine work from the last decade addressing diverse 5G problems, such as physical-medium state estimation, network traffic prediction, user device location prediction, and self-managing networks. We also discuss the main research challenges of using DL models in 5G scenarios and identify several issues that deserve further consideration.
REVIEW | doi:10.20944/preprints202003.0309.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: economics; deep reinforcement learning; deep learning; machine learning
Online: 20 March 2020 (07:13:42 CET)
The popularity of deep reinforcement learning (DRL) methods in economics has increased exponentially. By combining the capabilities of reinforcement learning (RL) and deep learning (DL) for handling sophisticated, dynamic business environments, DRL offers vast opportunities. DRL is characterized by scalability, with the potential to be applied to high-dimensional problems and to noisy, nonlinear patterns in economic data. In this work, we first give a brief review of DL, RL, and deep RL methods across diverse applications in economics, providing an in-depth insight into the state of the art. Furthermore, we investigate the architecture of DRL as applied to economic applications in order to highlight its complexity, robustness, accuracy, performance, computational demands, risk constraints, and profitability. The survey results indicate that DRL can provide better performance and higher accuracy than traditional algorithms on real economic problems in the presence of risk parameters and ever-increasing uncertainty.
ARTICLE | doi:10.20944/preprints202209.0483.v1
Subject: Engineering, Control & Systems Engineering
Keywords: deep reinforcement learning; data efficient; curriculum learning; transfer learning
Online: 30 September 2022 (10:35:06 CEST)
Sparse-reward, long-horizon tasks are a major challenge for deep reinforcement learning algorithms. One of the key barriers is data inefficiency: even in simulation, training the agent usually takes weeks. In this study, a data-efficient training framework is proposed, in which a curriculum is designed for the agent in the simulation scenario. Different distributions of the initial state are set so that the agent receives more informative rewards throughout the training process. The parameters of the output layer of the value-function neural network are then fine-tuned to bridge the sim-to-real gap. A UAV maneuver-control experiment conducted within the proposed framework verifies that the method is more efficient. We demonstrate that data efficiency differs for the same data at different training stages.
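The curriculum over initial-state distributions described above can be illustrated with a small sketch (an assumption about the general idea, not the paper's implementation): early in training, episodes start close to the goal so informative reward arrives quickly, and the start distribution widens as training progresses.

```python
import random

def sample_initial_state(progress, goal=0.0, max_radius=10.0):
    """Sample an episode's initial state for a 1-D toy task.

    progress in [0, 1] is the fraction of training completed; the
    start distribution widens around the goal as progress grows.
    """
    radius = max_radius * progress
    return goal + random.uniform(-radius, radius)

early = sample_initial_state(progress=0.1)   # starts close to the goal
late = sample_initial_state(progress=1.0)    # anywhere in the state space
```

A real implementation would sample full vehicle states (position, attitude, velocity) the same way, scheduling `progress` against the training-step counter.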
ARTICLE | doi:10.20944/preprints201909.0159.v1
Subject: Engineering, Electrical & Electronic Engineering
Keywords: radio over fiber; nonlinearities mitigation; reinforcement learning (RL) method
Online: 16 September 2019 (10:37:01 CEST)
We propose a 10-Gb/s, 64-quadrature amplitude modulation (QAM) signal-based radio-over-fiber (RoF) system over 50 km of standard single-mode fiber, which utilizes a SARSA-based reinforcement learning (RL) decision method to identify effective decisions that mitigate nonlinearity. The results demonstrate that the RL-SARSA algorithm achieves a significant reduction in bit error rate.
ARTICLE | doi:10.20944/preprints202101.0176.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory
Keywords: recommender system; tag-aware; deep reinforcement learning; user cold start
Online: 11 January 2021 (10:02:49 CET)
Recently, the application of deep reinforcement learning (DRL) in recommender systems has flourished, standing out by overcoming the drawbacks of traditional methods and achieving high recommendation quality; the dynamics, long-term returns, and data-sparsity issues of recommender systems have been effectively addressed. However, applying deep reinforcement learning introduces problems of interpretability, overfitting, complex reward-function design, and user cold start. This paper proposes a tag-aware recommender system based on deep reinforcement learning that avoids complex reward-function design and exploits tags to make up for the interpretability problems of recommender systems. Our experiments are carried out on the MovieLens dataset. The results show that the DRL-based recommender system is superior to traditional algorithms in minimum error, and that using tags to improve interpretability has little effect on accuracy. In addition, the DRL-based recommender system performs excellently on user cold-start problems.
ARTICLE | doi:10.20944/preprints202006.0046.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: Postural Balance; Deep Reinforcement Learning; Postural Stabilisation; Biomechanics
Online: 8 June 2020 (10:25:54 CEST)
Learning to maintain postural balance while standing requires a significant fine coordination effort between the neuromuscular system and the sensory system. It is one of the key contributing factors towards fall prevention, especially in the older population. Using artificial intelligence (AI), we can similarly teach an agent to maintain a standing posture, and thus teach the agent not to fall. In this paper, we investigate the learning progress of an AI agent and how it maintains a stable standing posture through reinforcement learning. During training, the AI agent learnt three policies. First, it learnt to maintain the Centre-of-Gravity and Zero-Moment-Point in front of the body. Then, it learnt to shift the load of the entire body onto one leg while using the other leg for fine-tuning the balancing action. Finally, it started to learn the coordination between the two pre-trained policies. This study shows the potential of using deep reinforcement learning in human movement studies. The learnt AI behaviour also exhibited attempts to achieve an unplanned goal that correlated with the set goal (e.g. walking in order to prevent falling). The failed attempts to maintain a standing posture are an interesting by-product which can enrich fall detection and prevention research efforts.
ARTICLE | doi:10.20944/preprints202103.0592.v1
Subject: Engineering, Electrical & Electronic Engineering
Keywords: Electric Vehicles; batch reinforcement learning; dueling neural networks; fitted Q-iteration
Online: 24 March 2021 (13:44:36 CET)
We consider the problem of coordinating the charging of an entire fleet of electric vehicles (EV), using a model-free approach, i.e. purely data-driven reinforcement learning (RL). The objective of the RL-based control is to optimize charging actions, while fulfilling all EV charging constraints (e.g. timely completion of the charging). In particular, we focus on batch-mode learning and adopt fitted Q-iteration (FQI). A core component in FQI is approximating the Q-function using a regression technique, from which the policy is derived. Recently, a dueling neural networks architecture was proposed and shown to lead to better policy evaluation in the presence of many similar-valued actions, as applied in a computer game context. The main research contributions of the current paper are that (i) we develop a dueling neural networks approach for the setting of joint coordination of an entire EV fleet, and (ii) we evaluate its performance and compare it to an all-knowing benchmark and an FQI approach using the extra trees regression technique, a popular approach currently discussed in EV-related works. We present a case study where RL agents are trained with an epsilon-greedy approach for different objectives: (a) cost minimization, and (b) maximization of self-consumption of local renewable energy sources. Our results indicate that RL agents achieve significant cost reductions (70--80%) compared to a business-as-usual scenario without smart charging. Comparing the dueling neural networks regression to extra trees indicates that for our case study's EV fleet parameters and training scenario, the extra trees-based agents achieve higher performance in terms of both lower costs (or higher self-consumption) and stronger robustness, i.e. less variation among trained agents. This suggests that adopting dueling neural networks in this EV setting is not particularly beneficial as opposed to the Atari game context from where this idea originated.
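The FQI loop at the heart of this approach can be sketched compactly. In the sketch below (illustrative only), the function approximator is replaced by a simple per-(state, action) averaging "regressor" over a toy discrete batch; the paper itself plugs in dueling neural networks or extra trees at that step.

```python
from collections import defaultdict

GAMMA, ACTIONS = 0.95, (0, 1)   # e.g. 0 = delay charging, 1 = charge now

def fitted_q_iteration(batch, n_iters=50):
    """Batch-mode FQI: repeatedly regress Q onto bootstrapped targets.

    batch is a list of (state, action, reward, next_state) transitions.
    """
    q = defaultdict(float)                       # current Q estimate
    for _ in range(n_iters):
        # 1. Build regression targets r + gamma * max_a' Q(s', a').
        targets = {}
        for s, a, r, s_next in batch:
            best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
            targets.setdefault((s, a), []).append(r + GAMMA * best_next)
        # 2. "Fit" the regressor: here, just average targets per (s, a);
        #    the paper uses dueling networks / extra trees instead.
        q = defaultdict(float,
                        {k: sum(v) / len(v) for k, v in targets.items()})
    return q

# Toy batch: in state 'low', charging (action 1) earns reward 1.
batch = [('low', 1, 1.0, 'done'), ('low', 0, 0.0, 'done')]
q = fitted_q_iteration(batch)
policy = max(ACTIONS, key=lambda a: q[('low', a)])   # -> 1 (charge)
```

The derived greedy policy is read off the final Q estimate exactly as in the last line; an epsilon-greedy variant of it is what generates the training batches in the case study.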
ARTICLE | doi:10.20944/preprints202203.0199.v1
Subject: Engineering, Control & Systems Engineering
Keywords: micropositioners; reinforcement learning; disturbance observer; deep deterministic policy gradient
Online: 15 March 2022 (07:58:27 CET)
The robust control of high-precision electromechanical systems, such as micropositioners, is challenging because of their inherent high nonlinearity, their sensitivity to external interference, and the complexity of accurately identifying model parameters. To cope with these problems, this work investigates a disturbance-observer-based deep reinforcement learning control strategy to realize high robustness and precise tracking performance. Reinforcement learning has shown great potential as an optimal control scheme; however, its application in micropositioning systems is still rare. Therefore, the deep deterministic policy gradient (DDPG) algorithm, embedded with an integral differential compensator (ID), is utilized in this work to not only decrease the state error but also improve the transient response speed. In addition, an adaptive sliding mode disturbance observer (ASMDO) is proposed to further eliminate the collective effect of the lumped disturbances. Intensive tracking simulation experiments reveal sterling performance and demonstrate improvements in the accuracy and response time of the controller.
REVIEW | doi:10.20944/preprints202201.0050.v1
Subject: Engineering, Mechanical Engineering
Keywords: turbulence; flow control; simulation; aerodynamics; machine learning; deep reinforcement learning
Online: 6 January 2022 (09:36:50 CET)
In this review we summarize existing trends in flow control used to improve the aerodynamic efficiency of wings. We first discuss active methods to control turbulence, starting with flat-plate geometries and building towards the more complicated flow around wings. Then, we discuss active approaches to control separation, a crucial aspect of achieving high aerodynamic efficiency. Furthermore, we highlight methods relying on turbulence simulation, and discuss various levels of modelling. Finally, we thoroughly review data-driven methods and their application to flow control, focusing on deep reinforcement learning (DRL). We conclude that this methodology has the potential to discover novel control strategies in complex turbulent flows of aerodynamic relevance.
ARTICLE | doi:10.20944/preprints202206.0028.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: reinforcement learning (RL); online learning; mobile health; algorithm design; algorithm evaluation; decision support systems
Online: 2 June 2022 (06:05:06 CEST)
Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurate mechanistic models for the user dynamics. To guide how one can tackle these challenges, we extend the PCS (Predictability, Computability, Stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning (Yu and Kumbier, 2020), to the design of RL algorithms for the digital interventions setting. Further, we provide guidelines on how to design simulation environments, a crucial tool for evaluating RL candidate algorithms using the PCS framework. We illustrate the use of the PCS framework for designing an RL algorithm for Oralytics, a mobile health study aiming to improve users' tooth-brushing behaviors through the personalized delivery of intervention messages. Oralytics will go into the field in late 2022.
REVIEW | doi:10.20944/preprints202208.0104.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: HEMS; Reinforcement Learning; Deep Neural Network; Q-Value; Policy Gradient; Natural Gradient; Actor-Critic; Residential; Commercial; Academic
Online: 1 September 2022 (04:27:12 CEST)
The steep rise of reinforcement learning (RL) in various energy applications, together with the growing penetration of home automation in recent years, motivates this article. It surveys the use of RL in various home energy management system (HEMS) applications, with a focus on deep neural network (DNN) models in RL. The article provides an overview of reinforcement learning, followed by discussions of state-of-the-art value-based, policy-based, and actor–critic methods in deep reinforcement learning (DRL). To make the published reinforcement learning literature more accessible to the HEMS community, verbal descriptions are accompanied by explanatory figures and by mathematical expressions using standard machine learning terminology. Next, a detailed survey of how reinforcement learning is used in different HEMS domains is presented, including which reinforcement learning algorithms are used in each HEMS application; it suggests that research in this direction is still in its infancy. Lastly, the article proposes four performance metrics for evaluating RL methods.
ARTICLE | doi:10.20944/preprints202010.0413.v1
Subject: Engineering, Civil Engineering
Keywords: Real-time Control; Reinforcement Learning; Smart Stormwater Systems; Urban Flooding
Online: 20 October 2020 (15:03:45 CEST)
Climate change and development have increased urban flooding, requiring modernization of stormwater infrastructure. Retrofitting standard passive systems with controllable valves/pumps is promising, but requires real-time control (RTC). One method of automating RTC is reinforcement learning (RL), a general technique for sequential optimization and control in uncertain environments. The notion is that an RL algorithm can use inputs of real-time flood data and rainfall forecasts to learn a policy for controlling the stormwater infrastructure to minimize measures of flooding. In real-world conditions, rainfall forecasts and other state information are subject to noise and uncertainty. To account for these characteristics of the problem data, we implemented Deep Deterministic Policy Gradient (DDPG), an RL algorithm that is distinguished by its capability to handle noise in the input data. DDPG implementations were trained and tested against a passive flood control policy. Three primary cases were studied: (i) perfect data, (ii) imperfect rainfall forecasts, and (iii) imperfect water level and forecast data. One hundred rainfall episodes that caused flooding in the passive system were selected from 10 years of observations in Norfolk, Virginia, USA; 85 randomly selected episodes were used for training and the remaining 15 unseen episodes served as test cases. Compared to the passive system, all RL implementations reduced flooding volume by 70.5% on average, and performed within a range of 5%. This suggests that DDPG is robust to noisy input data, which is essential knowledge to advance the real-world applicability of RL for stormwater RTC.
ARTICLE | doi:10.20944/preprints201805.0353.v1
Subject: Mathematics & Computer Science, General & Theoretical Computer Science
Keywords: big data; big data system; energy; district heating; reinforcement learning
Online: 24 May 2018 (16:05:27 CEST)
This paper presents a study on improving the thermal efficiency of user equipment rooms in a district heating system using reinforcement learning, and suggests a general method for constructing a learning network (DQN) using deep Q-learning, a model-free reinforcement learning algorithm. In addition, we introduce a big data platform system and an integrated heat management system for the energy field, handling the massive data produced by IoT sensors installed in a large number of thermal energy control facilities.
ARTICLE | doi:10.20944/preprints201808.0049.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: intelligent driving vehicle; trajectory planning; end-to-end; deep reinforcement learning; model transfer
Online: 2 August 2018 (13:06:39 CEST)
To address the problems of model error and tracking dependence in intelligent vehicle motion planning, an intelligent vehicle model-transfer trajectory planning method based on deep reinforcement learning is proposed, which directly obtains an effective control action sequence. Firstly, an abstract model of the real environment is extracted. On this basis, Deep Deterministic Policy Gradient (DDPG) and a vehicle dynamics model are adopted to jointly train a reinforcement learning model that decides the optimal intelligent driving maneuver. Secondly, the actual scene is transferred to an equivalent virtual abstract scene by the transfer model, and the control action and trajectory sequences are calculated from the trained deep reinforcement learning model. Thirdly, the optimal trajectory sequence is selected according to an evaluation function in the real environment. Finally, the results demonstrate that the proposed method handles intelligent vehicle trajectory planning with continuous input and continuous output, and that the model-transfer method improves generalization performance. Compared with traditional trajectory planning, the proposed method outputs a continuous steering-angle control sequence while also reducing the lateral control error.
REVIEW | doi:10.20944/preprints202111.0044.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics
Keywords: deep reinforcement learning; model-based RL; hierarchy; trading; cryptocurrency; foreign exchange; stock market; risk; prediction; reward shaping
Online: 2 November 2021 (10:57:23 CET)
Deep reinforcement learning (DRL) has achieved significant results in many Machine Learning (ML) benchmarks. In this short survey we provide an overview of DRL applied to trading on financial markets, including a short meta-analysis using Google Scholar, with an emphasis on using hierarchy for dividing the problem space as well as using model-based RL to learn a world model of the trading environment which can be used for prediction. In addition, multiple risk measures are defined and discussed, which not only provide a way of quantifying the performance of various algorithms, but they can also act as (dense) reward-shaping mechanisms for the agent. We discuss in detail the various state representations used for financial markets, which we consider critical for the success and efficiency of such DRL agents. The market in focus for this survey is the cryptocurrency market.
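The dual role of risk measures described above (performance metric and dense shaping reward) can be made concrete with one common example. The sketch below is illustrative, not from the survey: a rolling Sharpe-like ratio over the agent's recent per-step returns, used as the step reward instead of raw profit alone.

```python
import math

def sharpe_reward(returns, risk_free=0.0, eps=1e-8):
    """Mean excess return divided by return volatility (a Sharpe-like ratio)."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / n
    return (mean - risk_free) / (math.sqrt(var) + eps)

# Each step, reward the agent with the risk-adjusted quality of its
# recent per-step returns rather than the raw profit of the last trade.
recent = [0.01, -0.005, 0.02, 0.0, 0.015]
reward = sharpe_reward(recent)
```

Used this way, the same quantity that ranks trained agents after the fact also delivers a dense learning signal during training, penalizing volatile strategies even when their average return is high.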
ARTICLE | doi:10.20944/preprints202111.0514.v1
Subject: Engineering, Electrical & Electronic Engineering
Keywords: millimeter bands; fifth Generation; Handover; Deep Reinforcement Learning; Jump Markov Linear System
Online: 29 November 2021 (07:50:19 CET)
Fifth-generation (5G) mobile networks use millimeter waves (mmWaves) to offer gigabit data rates. However, unlike microwaves, mmWave links are prone to user and topographic dynamics: they are easily blocked and end up forming irregular cell patterns for 5G, which in turn causes too-early, too-late, or wrong handoffs (HOs). To mitigate HO challenges, sustain connectivity, and avert unnecessary HOs, we propose a HO scheme based on the Jump Markov Linear System (JMLS) and Deep Reinforcement Learning (DRL). JMLS is widely known to account for abrupt changes in system dynamics, while DRL has emerged as an artificial intelligence technique for learning high-dimensional, time-varying behaviors. We combine the two techniques to account for time-varying, abrupt, and irregular changes in mmWave link behaviour by predicting likely deterioration patterns of target links. The prediction is optimized by meta-training techniques that also reduce the training sample size. The JMLS-DRL platform thus formulates intelligent and versatile HO policies for 5G. Results show that our scheme predicts target link behavior after a HO with high reliability; it also averts unnecessary HOs and thereby supports longer dwell times.
ARTICLE | doi:10.20944/preprints202203.0161.v1
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: multi-agent systems; multi-agent reinforcement learning; internet of vehicles; urban area
Online: 11 March 2022 (05:13:15 CET)
Smart Internet of Vehicles (IoV) combined with Artificial Intelligence (AI) will contribute to vehicle decision-making in Intelligent Transportation Systems (ITS). The Multi-Vehicle Pursuit (MVP) game, in which multiple vehicles cooperate to capture mobile targets, is gradually becoming a hot research topic. Although there are some achievements in the field of MVP in open-space environments, urban areas bring complicated road structures and restricted moving spaces as challenges to the resolution of MVP games. In this paper we define an Observation-constrained MVP (OMVP) problem and propose a Transformer-based Time and Team Reinforcement Learning scheme (T3OMVP) to address it. First, a new multi-vehicle pursuit model is constructed based on decentralized partially observed Markov decision processes (Dec-POMDP) to instantiate the problem. Second, by introducing and modifying a transformer-based observation sequence, QMIX is redefined to adapt to the complicated road structure, restricted moving spaces, and constrained observations, so as to control the vehicles' pursuit of the target on the basis of their combined observations. Third, a multi-intersection urban environment is built to verify the proposed scheme. Extensive experimental results demonstrate that the proposed T3OMVP scheme achieves significant improvements of 9.66%~106.25% over state-of-the-art QMIX approaches. Code is available at https://github.com/pipihaiziguai/T3OMVP.
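QMIX's defining constraint — monotonic mixing of per-agent values into one joint value — can be illustrated with a toy sketch. The weights and Q-values below are made up; in real QMIX, non-negative weights are produced by a hypernetwork conditioned on the global state:

```python
def qmix_mix(agent_qs, weights, bias):
    """Monotonic mixing of per-agent Q-values into a joint Q_tot.
    Non-negative weights guarantee that each agent's greedy action is
    consistent with the greedy joint action (the QMIX property)."""
    assert all(w >= 0 for w in weights), "monotonicity requires w >= 0"
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias

# Three vehicles' individual Q-values mixed into one team value.
q_tot = qmix_mix([1.0, 2.0, 0.5], weights=[0.5, 0.3, 0.2], bias=0.1)
```

Because the weights are non-negative, raising any single vehicle's Q-value can never lower the team value, which is what lets each agent act greedily on its own Q-function during decentralized execution.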
COMMUNICATION | doi:10.20944/preprints202104.0575.v2
Subject: Mathematics & Computer Science, Artificial Intelligence & Robotics Keywords: Ethnopharmacology; Artificial Intelligence; Web Crawling; Active Learning; Reinforcement Learning; Text Mining; Big Data
Online: 23 June 2021 (11:47:32 CEST)
Ethnopharmacology experts face several challenges when identifying and retrieving documents and resources related to their scientific focus. The volume of sources that need to be monitored, the variety of formats utilized, and the varying quality of language use across sources present some of what we call “big data” challenges in the analysis of this data. This study aims to understand if and how experts can be supported effectively through intelligent tools in the task of ethnopharmacological literature research. To this end, we utilize a real case study of ethnopharmacology research focused on the Southern Balkans and the coastal zone of Asia Minor, and we propose a methodology for more efficient research in ethnopharmacology. Our work follows an “Expert-Apprentice” paradigm in an automatic URL extraction process through crawling, where the apprentice is a Machine Learning (ML) algorithm utilizing a combination of Active Learning (AL) and Reinforcement Learning (RL), and the expert is the human researcher. ML-powered research improved the domain expert's effectiveness 3.1-fold and efficiency 5.14-fold, fetching a total of 420 relevant ethnopharmacological documents in only 7 hours versus an estimated 36 hours of human-expert effort. Therefore, utilizing Artificial Intelligence (AI) tools to support the researcher can boost the efficiency and effectiveness of the identification and retrieval of appropriate documents.
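An "Expert-Apprentice" loop of this kind can be sketched as confidence-based triage: the ML apprentice acts on URLs it is confident about and queries the human expert on uncertain ones. The thresholds, scores, and URLs below are illustrative, not the paper's:

```python
def triage(scored_urls, low=0.4, high=0.6):
    """Expert-Apprentice routing: the ML apprentice keeps URLs it is
    confident about and queries the human expert on uncertain ones.
    Thresholds and relevance scores here are illustrative."""
    fetch, discard, ask_expert = [], [], []
    for url, score in scored_urls:
        if score >= high:
            fetch.append(url)          # apprentice confident: relevant
        elif score <= low:
            discard.append(url)        # apprentice confident: irrelevant
        else:
            ask_expert.append(url)     # active-learning query to the expert
    return fetch, discard, ask_expert

scored = [("a.org/doc1", 0.9), ("b.org/ads", 0.1), ("c.org/maybe", 0.5)]
fetch, discard, ask = triage(scored)
```

The expert's answers to the uncertain cases then become new labels (and, in an RL framing, rewards) that retrain the apprentice, so the expert is only consulted where a label is most informative.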
ARTICLE | doi:10.20944/preprints202207.0110.v1
Subject: Mathematics & Computer Science, Applied Mathematics Keywords: associative learning; molecular circuits; synthetic biology; mathematical modeling; Hill equation; Pavlov’s dog; reinforcement; dissociation; non-dimensionalization
Online: 7 July 2022 (04:38:20 CEST)
The development of synthetic biology has enabled massive progress in biotechnology and allows research questions to be approached from a brand-new perspective. In particular, the design and study of gene regulatory networks in vitro, in vivo, and in silico have played an increasingly indispensable role in understanding and controlling biological phenomena. Among these questions, it is of great interest to understand how associative learning is formed at the molecular circuit level. Notably, mathematical models have been increasingly used to predict the behaviors of molecular circuits. Fernando's model, thought to be one of the first works in this line of research using the Hill equation, attempted to design a synthetic circuit that mimics Hebbian learning in a neural network architecture. In this article, we carry out an in-depth computational analysis of the model and demonstrate that the reinforcement effect can be achieved by choosing proper parameter values. We also construct a novel circuit that can demonstrate forced dissociation, which was not observed in Fernando's model. Our work can be readily used as a reference by synthetic biologists who consider implementing circuits of this kind in biological systems.
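The Hill equation at the heart of such models is simple to state and simulate. A minimal sketch follows, with one Euler-integrated gene driven by a Hill activation term; the rate constants are illustrative placeholders, not Fernando's parameters:

```python
def hill(x, K=1.0, n=2):
    """Hill activation: fraction of promoter occupancy at input level x,
    with half-saturation constant K and Hill coefficient n."""
    return x ** n / (K ** n + x ** n)

def euler_step(y, x, alpha=1.0, delta=0.1, dt=0.01):
    """One Euler step of a gene produced at rate alpha*hill(x) and
    degraded linearly at rate delta; constants are illustrative."""
    return y + dt * (alpha * hill(x) - delta * y)

# Expression level y approaches the steady state alpha * hill(x) / delta.
y = 0.0
for _ in range(1000):
    y = euler_step(y, x=2.0)
```

At x = K the Hill term is exactly 1/2, and increasing n sharpens the switch-like response — the property that makes Hill kinetics a natural building block for learning-like molecular circuits.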
ARTICLE | doi:10.20944/preprints202201.0147.v1
Subject: Engineering, Civil Engineering Keywords: hybrid FRP-steel reinforcement; ductility; hybrid reinforcement ratio; fiber element; neutral axis
Online: 11 January 2022 (14:04:26 CET)
An experimental study was carried out to evaluate the ductility of concrete beams longitudinally reinforced with hybrid FRP-steel bars. The specimens were fourteen reinforced concrete beams with and without hybrid reinforcement. The test variables were the bar position, the ratio of longitudinal reinforcement, and the type of FRP bars. The beams were loaded to failure in a four-point bending test, and their performance was observed using the load-deflection curves obtained. Numerical analysis with a fiber element model was used to examine the growth of neutral axis depth due to the test variables; the neutral axis curves were then used to estimate the neutral axis angle and the neutral axis displacement index. The test results show that the position of the reinforcement greatly influences the flexural behavior of beams with hybrid reinforcement. The flexural capacity of beams with hybrid reinforcement was 4% to 50% higher than that of beams with conventional steel bars, depending on the bar position and the ratio of longitudinal reinforcement. Ductility decreases as the hybrid reinforcement ratio (Af/As) increases. This study also showed that the numerical model developed can predict the flexural behavior of beams with hybrid reinforcement with reasonable accuracy.
Subject: Mathematics & Computer Science, Other Keywords: reinforcement learning; bitrate streaming; world-models; video streaming; model-based reinforcement learning
Online: 20 August 2020 (07:02:57 CEST)
Adaptive bitrate (ABR) algorithms optimize the quality of streaming experiences for users in client-side video players, especially in unreliable or slow mobile networks. Several rule-based heuristic algorithms can achieve stable performance, but they sometimes fail to adapt properly to changing network conditions: fluctuating bandwidth may cause them to default to behavior that creates a negative experience for the user. ABR algorithms can instead be generated with reinforcement learning, a decision-making paradigm in which an agent learns to make optimal choices through interactions with an environment. Training reinforcement learning algorithms for bitrate streaming requires building a simulator in which an agent can experience interactions quickly; training an agent in the real environment is infeasible due to the long step times. This project explores using supervised learning to construct a world-model, or learned simulator, from recorded interactions. A reinforcement learning agent trained inside the learned world-model, rather than a hand-built simulator, can outperform rule-based heuristics. Furthermore, agents trained inside the learned world-model can outperform model-free agents in low-sample regimes. This work highlights the potential of world-models to serve as quickly learned simulators and to generate optimal policies.
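The world-model idea — fit a simulator to logged transitions, then train the agent entirely inside it — can be sketched in miniature. The lookup-table "model" below is a hypothetical stand-in for the learned neural dynamics model, and the state/reward scheme is illustrative:

```python
# Logged real interactions, (state, action) -> next state, on a toy
# 0..10 "bitrate level" scale. A dict stands in for the supervised
# world model; the project learns a neural simulator instead.
model = {(s, a): min(10, max(0, s + a))
         for s in range(11) for a in (-1, 0, 1)}

def imagined_rollout(policy, start=5, horizon=20):
    """Generate imagined experience entirely inside the learned model,
    so the agent never touches the slow real environment while training."""
    s, total = start, 0
    for _ in range(horizon):
        a = policy(s)
        s = model[(s, a)]
        total += 1 if s == 10 else 0   # illustrative reward: top bitrate
    return total

always_up = lambda s: 1                # toy policy: always step upward
ret = imagined_rollout(always_up)
```

Every rollout step is a dictionary lookup rather than a multi-second real playback step, which is precisely why training inside the learned model is feasible where real-environment training is not.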
ARTICLE | doi:10.20944/preprints202010.0156.v1
Subject: Keywords: Artificial intelligence; Deep reinforcement learning; Demand Response; Dynamic pricing; Energy management system; Microgrid; Neural networks; Price-responsive loads; Smart grid; Thermostatically controlled loads
Online: 7 October 2020 (11:21:03 CEST)
In this paper, we study the performance of various deep reinforcement learning algorithms to enhance the energy management system of a microgrid. We propose a novel microgrid model that consists of a wind turbine generator, an energy storage system, a set of thermostatically controlled loads, a set of price-responsive loads, and a connection to the main grid. The proposed energy management system is designed to coordinate among the different flexible sources by defining the priority resources, direct demand control signals, and electricity prices. Seven deep reinforcement learning algorithms were implemented and are empirically compared in this paper. The numerical results show that the deep reinforcement learning algorithms differ widely in their ability to converge to optimal policies. By adding an experience replay and a semi-deterministic training phase to the well-known asynchronous advantage actor-critic algorithm, we achieved the highest model performance as well as convergence to near-optimal policies.
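Experience replay, one of the two ingredients added here to the asynchronous advantage actor-critic algorithm, can be sketched as follows; the capacity, batch size, and transition contents are illustrative:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of past transitions; sampling uniformly from it
    breaks the temporal correlation of consecutive experiences. Capacity
    and batch size below are illustrative."""
    def __init__(self, capacity=10000):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):
        self.buf.append(transition)    # oldest transition is evicted

    def sample(self, batch_size):
        return random.sample(list(self.buf), batch_size)

buf = ReplayBuffer(capacity=100)
for t in range(500):
    buf.push((t, "charge", -0.5, t + 1))   # (s, a, r, s') placeholder
batch = buf.sample(32)
```

Updating on randomly drawn past transitions, rather than only the most recent ones, is what stabilizes learning enough for the modified actor-critic to reach near-optimal microgrid policies.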
ARTICLE | doi:10.20944/preprints202207.0461.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Malaria; digital; epidemic; mixed infections; reinforcement
Online: 29 July 2022 (11:25:46 CEST)
Malaria is a long-standing disease and one of the top life-threatening diseases, yet its treatment has not changed even as the world embraces the Fourth Industrial Revolution (4IR). A wave of research on digitizing the monitoring mechanisms of this deadly disease has surfaced, and automated malaria screening is one detection process gaining popularity in the research domain. However, screening needs to be coupled with other processes aiming at a nationally or regionally contextualised malaria monitoring system. This paper proposes a digital malaria monitoring system in the context of an African country or region. One advantage of such a digital system is that it enables a novel disease spread forecasting model based on the dynamics of different malaria types. The architecture of the diagnosis system is described, and the disease spread is mathematically modelled as an SPITR (Susceptible-Protected-Infected-Treated-Recovered) epidemic model, which is further analysed. The forecasting model is expressed and analysed, and experiments are conducted using a Monte Carlo simulation method. The design of the monitoring system has inspired how predictions can be made in complex cases such as mixed infections. Results show that reinforcing the model parameters makes a significant improvement in disease prediction.
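A compartmental model of the SPITR type can be sketched as a system of flows integrated with Euler steps. The flow structure and all rates below are illustrative placeholders, not the paper's calibrated model:

```python
def spitr_step(S, P, I, T, R, beta=0.3, p=0.05, tau=0.2, gamma=0.1, dt=0.1):
    """One Euler step of a simple SPITR flow: S->P (protection),
    S->I (infection at rate beta*S*I/N), I->T (treatment), T->R
    (recovery). All rates are illustrative placeholders."""
    N = S + P + I + T + R
    new_inf = beta * S * I / N
    dS = -p * S - new_inf
    dP = p * S
    dI = new_inf - tau * I
    dT = tau * I - gamma * T
    dR = gamma * T
    return (S + dt * dS, P + dt * dP, I + dt * dI, T + dt * dT, R + dt * dR)

state = (990.0, 0.0, 10.0, 0.0, 0.0)   # nearly all susceptible
for _ in range(500):
    state = spitr_step(*state)
```

Because every outflow from one compartment is an inflow to another, the total population is conserved at each step — a useful sanity check when extending such a model to mixed-infection dynamics.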
ARTICLE | doi:10.20944/preprints202110.0408.v1
Subject: Materials Science, Polymers & Plastics Keywords: PA66; PA66GF; weaves; reinforcement; overmoulding; composites
Online: 27 October 2021 (12:43:39 CEST)
The need to develop novel lightweight materials and their manufacturing processes arises from new aerospace, automotive, and construction requirements. Within this context, this research work proposes to develop a novel thermoplastic composite material with high mechanical properties. These composites will be based on thermoplastic matrices made from polyamide and 35% short-glass-fiber-filled polyamide, reinforced with different types of fabrics. As reinforcement, glass fiber fabrics will be used as the base; they will be treated with different processes, both chemical and physical, to promote adherence to the matrix. Textile overmoulding technology was selected for manufacturing these composites. This technology was primarily developed to manufacture aesthetic lined components and has achieved wide industrial adoption. Once these new composites are manufactured, they will be submitted to different tests to evaluate their behavior regarding adhesion, impact strength, and stiffness. An improvement in stiffness and impact absorption is expected.
REVIEW | doi:10.20944/preprints201804.0019.v1
Subject: Engineering, Civil Engineering Keywords: building rehabilitation; energy efficiency; seismic reinforcement
Online: 2 April 2018 (10:26:38 CEST)
Most European cities are characterized by very large areas, often formed by buildings of low quality from several points of view. The possibility of renovating them is strategic both for improving the quality of life and for offering economic recovery to building companies. In recent decades, the attention of the scientific community has been devoted to energy renovation, thanks to the strong activities of the European Community in this field. However, since a relevant part of the EC territory is at risk of earthquakes, the possibility of combining energy and seismic renovation actions may be strategic for many countries. In particular, Italy and Romania are linked by a common social tradition that springs from the Roman Empire; nowadays this link is stronger, thanks to common interests in social, cultural, and business fields. Therefore, the investigation of possible synergies between seismic and energy renovation strategies may be of real interest to both countries. This paper represents a first step in this direction. After an overview of regulations and common practices for buildings with reinforced concrete structures in both states, some key combined renovation interventions are described and discussed, as well as the advantages and perspectives of integrated renovation approaches.
ARTICLE | doi:10.20944/preprints201704.0118.v2
Subject: Engineering, Civil Engineering Keywords: bond; concrete; reinforcement; damage-plasticity; failure
Online: 25 August 2017 (08:01:21 CEST)
The structural performance of reinforced concrete relies heavily on the bond between reinforcement and concrete. In nonlinear finite element analyses, bond is modelled either with merged nodes (the perfect-bond approach) or with coincident nodes that allow slip (the bond-slip approach). Here, the performance of these two approaches for modelling the failure of reinforced concrete was investigated using a damage-plasticity constitutive model in LS-DYNA. Firstly, the influence of element size on the response of tension-stiffening analyses with the two modelling approaches was investigated. Then, the results of the two approaches were compared for plain and fibre-reinforced tension stiffening and a drop-weight impact test. It was shown that only the coincident-with-slip approach provided mesh-insensitive results. However, both approaches were capable of satisfactorily reproducing the overall response of the experiments, in the form of loads and displacements, for the meshes used.
ARTICLE | doi:10.20944/preprints202101.0115.v1
Subject: Physical Sciences, Acoustics Keywords: machine learning; virtual diagnostics; reinforcement learning control
Online: 6 January 2021 (11:58:41 CET)
We discuss the implementation of a suite of virtual diagnostics at the FACET-II facility currently under commissioning at SLAC National Accelerator Laboratory. The diagnostics will be used for prediction of the longitudinal phase space along the linac, spectral reconstruction of the bunch profile, and non-destructive inference of transverse beam quality (emittance) using edge radiation at the injector dogleg and bunch compressor locations. These measurements will be folded into adaptive feedbacks and ML-based reinforcement learning controls to improve the stability and optimize the performance of the machine for different experimental configurations. In this paper we describe each of these diagnostics with expected measurement results based on simulation data and discuss progress towards implementation in regular operations.
ARTICLE | doi:10.20944/preprints201805.0348.v1
Subject: Engineering, Mechanical Engineering Keywords: failure criteria; curauá fibers; reinforcement direction; ANOVA.
Online: 24 May 2018 (10:23:36 CEST)
Natural fibers are being increasingly used in different areas of engineering, including as composite reinforcement. Among these fibers, curauá stands out for its good mechanical properties and adherence to resin. Nevertheless, little is known about the behavior of this material in the manufacture of a composite, or whether classic failure theories can be applied in this case. In this context, the present study assesses the mechanical properties of two laminas made of unidirectional curauá fiber with volumetric fiber percentages of 30% and 22%, and compares the results with the values obtained for four failure criteria reported in the literature, using analysis of variance (ANOVA). To that end, tensile tests were conducted in the direction of the fiber and at other loading angles, in addition to Iosipescu shear tests. The results show that the maximum stress criterion does not represent the failure behavior of these materials and that the Hashin criterion performed best.
ARTICLE | doi:10.20944/preprints201608.0220.v1
Subject: Materials Science, Polymers & Plastics Keywords: composite fibers; flexural strength; polyester matrix; reinforcement
Online: 29 August 2016 (10:46:01 CEST)
Composite fiber materials are superior materials due to their high strength and light weight. Composites reflect the properties of their constituents in proportion to the volume fraction of each phase. There are different fiber reinforcement types, and each affects the flexural, tensile, and compression strength. When selecting a composite for a specific application, the forces exerted on the composite must be known in order to determine the reinforcement type. Unidirectional fiber reinforcement allows very strong load resistance, but only in one direction, whereas randomly oriented fiber reinforcement resists less load but maintains this capacity in all directions; such materials are said to be anisotropic. Certain composite fibers, taking their weight into consideration, are physically stronger than conventional metals. This research deals with the analysis of three composite materials with different reinforcement types, volume fractions, and phase contents. It was found that material A (glass epoxy) was the strongest, with a flexural strength of 534 MPa in the longitudinal direction and 420 MPa in the transverse direction. The flexural strengths of material B (glass silicone) and material C (glass polyester) were both found in the 120 to 135 MPa range. The differences were due to their differences in matrix composition and reinforcement type.
ARTICLE | doi:10.20944/preprints202203.0119.v1
Subject: Engineering, Automotive Engineering Keywords: smart scheduling; smart reservations; reinforcement learning; electric vehicle charging; electric vehicle charging management platform; neural network; DQN reinforcement Learning algorithm
Online: 8 March 2022 (08:54:48 CET)
The widespread adoption of electromobility constitutes one of the measures designed to reduce air pollution caused by traditional fossil fuels. However, several factors are currently impeding this process, ranging from insufficient charging infrastructure, battery capacity, and long queueing and charging times, to psychological factors. On top of range anxiety, the frustration of EV drivers is further fueled by the uncertainty of finding an available charging point on their route. To address this issue, we propose a solution that bypasses the limitations of the Reserve Now function of the OCPP standard, enabling drivers to make charging reservations for the upcoming days, especially when planning a longer trip. We created an algorithm that generates reservation intervals based on the charging station's reservation and transaction history. Subsequently, we ran a series of test cases that yielded promising results, with no overlapping reservations.
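The core non-overlap requirement on generated reservation intervals can be sketched as follows. The function, time representation, and station hours are hypothetical illustrations, not the platform's actual algorithm:

```python
def free_slots(open_t, close_t, booked, duration):
    """Enumerate candidate reservation intervals of a fixed duration that
    overlap no existing booking (times in minutes from midnight). A
    hypothetical sketch, not the platform's actual algorithm."""
    slots, t = [], open_t
    for b_start, b_end in sorted(booked) + [(close_t, close_t)]:
        while t + duration <= b_start:     # fill the gap before a booking
            slots.append((t, t + duration))
            t += duration
        t = max(t, b_end)                  # jump past the booking
    return slots

# Station open 8:00-12:00 with one booking 9:00-10:00; 60-minute slots.
slots = free_slots(480, 720, [(540, 600)], 60)
```

A real implementation would rank such candidate intervals using the station's reservation and transaction history; the sketch only shows the overlap-free enumeration that any such ranking must start from.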
ARTICLE | doi:10.20944/preprints201912.0229.v1
Subject: Engineering, Civil Engineering Keywords: masonry structures; stiffening walls; wall joints; connectors; bed joint reinforcement
Online: 17 December 2019 (10:46:57 CET)
Joints between walls are very important for the structural analysis of any masonry building at both the global and local level. This issue was often neglected in the case of traditional joints and relatively squat walls. Nowadays, the issue of wall joints is becoming particularly important due to the continuous drive to simplify structures and to introduce new technologies and materials. Eurocode 6 and other standards (US, Canadian, Chinese, and Japanese) recommend inspecting joints between walls, but no detailed procedures have been specified. This paper presents our own tests on joints between walls made of autoclaved aerated concrete (AAC) masonry units. Tests included reference models composed of two wall panels joined perpendicularly with a masonry bond (6 models), and models with traditional steel and modified connectors (12 models). The shape and size of the test models and the structure of the test stand were determined on the basis of an analysis of the current knowledge, pilot studies, and FEM numerical analyses. The analysis covered the morphology and failure mechanism of the models. Load-displacement relationships for different types of joints were compared, and the results were referred to those for the reference models. The mechanism of cracking and failure was found to vary, and clear differences in the behaviour and load capacity of each type of joint were observed. Individual working phases of the joints were determined and defined, and an empirical approach was suggested to determine the forces and displacements of wall joints.
ARTICLE | doi:10.20944/preprints201804.0021.v1
Subject: Social Sciences, Economics Keywords: interbank market; contagion risk; multi-agent system; reinforcement learning agents
Online: 2 April 2018 (10:51:49 CEST)
In this study, we examine the relationship between bank-level lending and borrowing decisions and risk preferences in the dynamics of interbank lending. We develop an agent-based model that incorporates individual bank decisions, using the temporal difference reinforcement learning algorithm, with empirical data on 6,600 U.S. banks. The model successfully replicates the key characteristics of interbank lending and borrowing relationships documented in the recent literature. A key finding of this study is that risk preferences at the individual bank level can lead to unique interbank market structures, which are suggestive of the market's capacity to respond to surprises.
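The temporal-difference rule driving each bank agent can be sketched in tabular form. The states, reward, and learning rates below are illustrative stand-ins, not the study's calibrated setup:

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.95):
    """Tabular TD(0): nudge the value of state s toward the bootstrapped
    target r + gamma * V(s'). States, reward, and rates are illustrative."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

V = {"lend": 0.0, "hold": 0.0}
for _ in range(200):                  # repeated experience of lending
    V = td0_update(V, "lend", r=1.0, s_next="hold")
```

Each bank updates its value estimates from its own sequence of lending outcomes rather than from a known model of the market, which is what lets heterogeneous risk preferences produce the distinct market structures the study reports.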
ARTICLE | doi:10.20944/preprints202107.0545.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: UAVs; Wireless Power Transfer; RF energy harvesting; MIMO; Deep Reinforcement Learning
Online: 23 July 2021 (13:29:51 CEST)
Unmanned Aerial Vehicles (UAVs), used in civilian applications such as emergency medical deliveries, precision agriculture, and wireless communication provisioning, face the challenge of limited flight time due to their reliance on the on-board battery. Developing efficient mechanisms for in-situ power transfer to recharge UAV batteries therefore holds potential for extending their mission time. In this paper, we study the use of far-field wireless power transfer (WPT) from specialized transmitter UAVs (tUAVs) carrying Multiple Input Multiple Output (MIMO) antennas to receiver UAVs (rUAVs) on a mission. The tUAVs can fly and adjust their distance to the rUAVs to maximize energy transfer, and the use of MIMO antennas further boosts the energy reception by narrowing the energy beam toward the rUAVs. The complexity of this dynamic operating environment increases with the growing number of tUAVs and rUAVs with varying levels of energy consumption and residual power. We propose an intelligent trajectory selection algorithm for the tUAVs based on a deep reinforcement learning model called Proximal Policy Optimization (PPO) to optimize the energy transfer gain. Simulation results demonstrate that with the use of PPO, the system achieves a tenfold flight-time extension compared to no wireless recharging. Further, PPO outperforms the benchmark movement strategies of "Traveling Salesman Problem" and "Low Battery First" when used by the tUAVs.
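The clipped surrogate objective that gives PPO its stable updates can be sketched per sample. The epsilon value of 0.2 is the commonly used default; the ratio and advantage inputs below are illustrative:

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: take the more pessimistic of the
    raw and clipped policy-ratio objectives, capping each update's size.
    eps=0.2 is a common default; the example inputs are illustrative."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return -min(ratio * advantage, clipped * advantage)

# A ratio far beyond 1+eps earns no extra gain for a positive advantage:
loss_far = ppo_clip_loss(ratio=1.5, advantage=2.0)
loss_edge = ppo_clip_loss(ratio=1.2, advantage=2.0)
```

Because moving the policy ratio past the clip boundary yields no further improvement in the objective, each gradient step keeps the new trajectory-selection policy close to the old one — the property that makes PPO practical for the tUAVs' continuously changing environment.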
ARTICLE | doi:10.20944/preprints201806.0127.v1
Subject: Engineering, Energy & Fuel Technology Keywords: Complicated structural region; Directional drilling; Grouting reinforcement; Coal floor; Karst aquifer
Online: 7 June 2018 (16:07:36 CEST)
Water inrush from the coal floor constitutes one of the main disasters in mine construction and production, and it always brings high risks and losses to safe coal mine production. As the mining depth of coal fields in North China has gradually increased, especially in complicated structural regions, the threat posed by the limestone karst water of the coal floor to the safe stoping of mines has become increasingly prominent. In this paper, the Taoyuan coalmine is taken as an example, for which directional borehole grouting technology was utilized to reinforce the coal seam floor prior to mining. The factors affecting the grouting effect were also analyzed: the geological structure, the crustal stress, and the range of slurry diffusion. The layout principle of the grouting drilling was put forward and the directional drilling structure was designed. The water level observations in the end hole indicated that the target stratum was reached accurately and reliably. The effect of grouting was validated through the audio-frequency electric perspective method and hole drilling in the track trough. The results demonstrated that the effect of grouting in the third limestone and the rock stratum above it in the coal seam floor was apparent. Moreover, no water inrush occurred during the actual mining of the working face, which further confirmed the grouting reinforcement effect. These findings are of high significance for the prevention and control of floor water disasters and for water conservation in deep, complex structural areas.
ARTICLE | doi:10.20944/preprints202109.0177.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: Sustainable wireless connectivity; Energy saving; UAV; Communication system; 5G; Positioning; Reinforcement learning
Online: 9 September 2021 (11:28:38 CEST)
An unmanned aerial vehicle (UAV)-based communication system is a promising solution for meeting the coverage and capacity requirements of future wireless networks. However, UAV-enabled communication is constrained by coverage, energy consumption, and flying regulations, and the number of works focusing on the sustainability aspect of UAV-assisted networking has so far been limited in the literature. In this paper, we propose a solution to this limitation; particularly, we design a $Q$-learning-based UAV positioning scheme for sustainable wireless connectivity that considers the key constraints of altitude regulations, non-flight zones, and transmit power. The objective is to find the optimal position of the UAV base station (BS) and minimize the energy consumption while maximizing the number of users covered. Moreover, a weighting mechanism is developed whereby the energy consumption and the number of users covered can be prioritized according to network/battery conditions. The proposed Q-learning-based solution is compared to a baseline k-means clustering method, in which the UAV BS is positioned at the centroid location that minimizes the cumulative distance between the UAV BS and the users. The results demonstrate that the proposed solution outperforms the baseline k-means clustering method in terms of the number of users covered while achieving the desired minimization of the energy consumption.
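A tabular Q-learning sketch of the positioning idea follows, with a toy 1-D strip of hover positions and a reward that trades coverage against movement energy. All parameters, the state space, and the reward form are illustrative stand-ins, not the paper's scheme:

```python
import random

def train_uav_positioning(n_pos=10, target=7, episodes=2000,
                          alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning on a toy 1-D strip of candidate hover positions.
    The reward trades users covered (peaking at `target`) against the
    energy cost of moving; both terms are illustrative stand-ins."""
    actions = (-1, 0, 1)
    Q = {(s, a): 0.0 for s in range(n_pos) for a in actions}
    random.seed(0)
    for _ in range(episodes):
        s = random.randrange(n_pos)
        for _ in range(20):
            if random.random() < epsilon:            # explore
                a = random.choice(actions)
            else:                                    # exploit
                a = max(actions, key=lambda x: Q[(s, x)])
            s2 = min(n_pos - 1, max(0, s + a))
            r = -abs(s2 - target) - 0.1 * abs(a)     # coverage - energy
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

Q = train_uav_positioning()
greedy = lambda s: max((-1, 0, 1), key=lambda a: Q[(s, a)])
```

The learned greedy policy steers the UAV BS toward the best-coverage position from either side and holds it there once reached; the paper's weighting mechanism corresponds to adjusting the relative magnitude of the two reward terms.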
ARTICLE | doi:10.20944/preprints202011.0675.v1
Subject: Engineering, Automotive Engineering Keywords: fiber reinforcement; SHCC; ECC; impact testing; split Hopkinson tension bar; structural inertia
Online: 26 November 2020 (15:03:39 CET)
The performance of a normal-strength SHCC under impact loading was studied using the results obtained from a split Hopkinson tension bar (SHTB). The focus of the investigation is to explain the mechanisms behind the peculiar rate-dependent behavior of SHCC under tensile loading. With the help of frames obtained by high-speed cameras and the subsequent Digital Image Correlation (DIC) analysis, the stress-strain relation of the SHCC obtained in SHTB was analyzed. The investigation of the composite’s behavior was supported by constituent-level experiments on the non-reinforced matrix of the SHCC and on the fiber-matrix bond. In the case of the constituent matrix, the well-known apparent increase in the tensile strength of the cement-based matrix and its influence on the behavior of SHCC was studied. For this purpose, experiments on the SHCC specimens with different geometries were performed in the SHTB. The results obtained from these experiments and those obtained by DIC show that commonly used analytical models, in which the specimen is assumed elastic, cannot capture the effects of structural inertia on the results. Thus, an alternative novel method based on the results of DIC has been used to explain and quantify the contribution of structural inertia. The rate-dependent behavior of the fiber-matrix bond was studied by performing high-speed single fiber pullout tests in a miniaturized split Hopkinson tension bar. This novel experimental technique enabled explanation of the rate-dependent bridging action of the fibers in SHCC. Based on the results, the enhanced behavior of SHCC under impact loading is explained.
ARTICLE | doi:10.20944/preprints201901.0021.v1
Subject: Engineering, Civil Engineering Keywords: Rebar location; FRP reinforcement; NDT methods; GPR testing; Ultrasonic testing; Electromagnetic testing
Online: 3 January 2019 (13:10:29 CET)
The increasing use of non-metallic reinforcement is problematic, as such reinforcement has to be detected at the stage of accepting construction works, or later, when expert opinions on the building are prepared. In contrast to metallic reinforcement, this type of reinforcement is difficult to locate using non-destructive techniques: the small diameters of the rebars and their position within the tested element make detection troublesome. This article describes an attempt to locate non-metallic reinforcement in a concrete element and in masonry. Tests were performed using an ultrasonic tomograph and GPR over a broad range of frequencies.
ARTICLE | doi:10.20944/preprints201809.0227.v1
Subject: Engineering, Electrical & Electronic Engineering Keywords: location-aware; cooperative anti-jamming; Markov decision process; Markov game; reinforcement learning
Online: 13 September 2018 (03:26:04 CEST)
This paper investigates the cooperative anti-jamming distributed channel selection problem in UAV communication networks. Considering the existence of malicious jamming and co-channel interference, a location-aware cooperative anti-jamming scheme is designed for the purpose of maximizing the users' utilities. Users in the UAV group cooperate with each other via location information sharing. When the received interference energy is lower than the mutual interference threshold, users conduct channel selection strategies independently; otherwise, users take joint actions in a cooperative anti-jamming pattern under the impact of mutual interference. For the independent anti-jamming channel selection problem under no mutual interference, a Markov Decision Process framework is introduced, whereas for the cooperative anti-jamming channel selection case under co-channel mutual interference, a Markov game framework is employed. Furthermore, motivated by reinforcement learning with a "Cooperation-Decision-Feedback-Adjustment" idea, we design a location-aware cooperative anti-jamming distributed channel selection algorithm (LCADCSA) to obtain the optimal anti-jamming channel strategies for the users in a distributed way. In addition, the channel switching cost and cooperation cost, which have a great impact on the users' utilities, are introduced. Finally, simulation results show that the proposed algorithm converges to a stable solution with which the UAV group can effectively avoid both malicious jamming and co-channel interference.
ARTICLE | doi:10.20944/preprints202201.0041.v1
Subject: Engineering, Civil Engineering Keywords: building remodeling; concentrated loads; FRP reinforcement; FRP strips; shear capacity; vertical concrete cantilever
Online: 5 January 2022 (13:01:16 CET)
Renovation, restoration, remodeling, refurbishment, and retrofitting of buildings often imply modifying the behavior of the structural system. Modification sometimes includes applying forces (i.e., concentrated loads) to beams that were previously subjected to distributed loads only. For a reinforced concrete structure, the new condition causes a beam to bear a concentrated load with the crack pattern produced by the distributed loads that acted in the past. If the concentrated load is applied at or near the beam's midspan, the new shear demand reaches its maximum around the midspan. But around the midspan the cracks are vertical or quasi-vertical, and no inclined bar is present. So the actual shear capacity around the midspan is not only low, but can also be substantially lower than the new demand. In order to bring the beam capacity up to the demand, fiber-reinforced-polymer composites can be used. This paper presents a design method to increase the concentrated load-carrying capacity of reinforced concrete beams whose load distribution has to be changed from distributed to concentrated, and an analytical model to predict the concentrated load-carrying capacity of a beam in the strengthened state.
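For orientation, most FRP shear-strengthening guidelines express the strengthened capacity in a generic additive format; the symbols below are illustrative and are not the paper's specific design model.

```latex
% Generic additive shear-capacity format used by most FRP strengthening
% guidelines: concrete, transverse-steel, and FRP contributions.
V_n = V_c + V_s + V_f
```

The paper's design method concerns sizing the $V_f$ term near midspan, where $V_s$ from inclined bars is absent and $V_c$ is degraded by the pre-existing vertical cracks.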
ARTICLE | doi:10.20944/preprints202008.0534.v1
Subject: Engineering, Civil Engineering Keywords: jute fibre; reinforcement; modified compaction test; California bearing ratio test; stabilization; shear strength
Online: 25 August 2020 (03:30:40 CEST)
This paper focuses on the stabilisation of soil using jute fibre as a soil stabilizer. Stabilisation is the process of modifying the properties of a soil to improve its engineering performance and make it usable for a variety of engineering works. This study examines the potential of soil stabilization with jute fibre cut into roughly 30 mm lengths as the stabilizer. Varying percentages of jute fibre pieces (0.5%, 1%, 1.5%, and 2%) were mixed with the soil. Laboratory tests, namely the California Bearing Ratio (CBR) test, modified compaction tests, and direct shear strength tests, were conducted to observe the change in the engineering properties of the soil. On the basis of the experiments performed, it can be concluded that stabilizing soil with 30 mm pieces of jute improves the strength characteristics of the soil so that it becomes usable as a reinforcing material for the construction of roadways, parking areas, site development projects, airports, and many other situations where sub-soils are not suitable for construction.
ARTICLE | doi:10.20944/preprints201902.0193.v1
Subject: Engineering, Civil Engineering Keywords: retrofitting; earthquakes; masonry; historical buildings; active reinforcement; Mohr’s circles; CAM system; Φ system
Online: 20 February 2019 (12:18:11 CET)
The present paper deals with the retrofitting of unreinforced masonry (URM) buildings subjected to in-plane shear and out-of-plane loading when struck by an earthquake. After an introductory comparison of some of the latest punctual and continuous active retrofitting methods, the authors focus on the two most effective active continuous techniques, the CAM system and the Φ system, which also improve the box-type behavior of buildings. These two retrofitting systems increase both the static and dynamic load-bearing capacity of masonry buildings. Nevertheless, information on how they actually modify the stress field in static conditions is lacking in the literature and sometimes questionable. Therefore, we performed a static analysis in the Mohr/Coulomb plane, with the dual intent of clarifying which of the two is preferable under static conditions and whether the models currently used to design the retrofitting systems are fully adequate.
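An analysis in the Mohr/Coulomb plane rests on the standard plane-stress circle and the Coulomb envelope; as a notation reminder (independent of the paper's specific models):

```latex
% Center and radius of Mohr's circle for the plane stress state
% (\sigma_x, \sigma_y, \tau_{xy}), and the Coulomb failure envelope
% with cohesion c and friction angle \varphi.
\sigma_{\mathrm{c}} = \frac{\sigma_x + \sigma_y}{2}, \qquad
R = \sqrt{\left(\frac{\sigma_x - \sigma_y}{2}\right)^{2} + \tau_{xy}^{2}}, \qquad
|\tau| = c + \sigma \tan\varphi
```

Active retrofitting adds a confining normal stress, which shifts the circle away from the Coulomb line; comparing how the CAM and Φ systems shift it is the substance of the static comparison.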
ARTICLE | doi:10.20944/preprints202209.0387.v1
Subject: Engineering, Other Keywords: maritime autonomy; autonomous ship; safety; digital twin; deep reinforcement learning; collision avoidance; situational awareness
Online: 26 September 2022 (08:55:58 CEST)
The use of digital twins for the development of Autonomous Maritime Surface Vessels (AMSVs) has enormous potential to address the increasing need for water-based navigation and safety at sea. Addressing the lack of broad, integrated digital twin implementations with live data, along with the absence of a digital twin-driven framework for AMSV design and development, we present an application framework for developing a fully autonomous vessel using an integrated digital twin in a 3D simulation environment. Our framework has four layers to ensure that the simulation matches the real-world boat and its environment as closely as possible. Åboat, an experimental research platform for maritime automation and autonomous surface ship applications, equipped with two trolling electric motors, cameras, LiDARs, an IMU, and GPS, has been used as the case study to provide a proof of concept. Åboat and its sensors, along with the environment, have been replicated in a 3D simulation environment. Using the proposed application framework, we develop machine-learning-based obstacle detection and path planning systems that leverage live data from the 3D simulation environment to mirror the complex dynamics of the real world.
ARTICLE | doi:10.20944/preprints201802.0049.v1
Subject: Materials Science, Polymers & Plastics Keywords: polymer-matrix composites; natural fiber reinforcement; interface/interphase; microstructural analysis; crystallization behavior; rheological behavior
Online: 6 February 2018 (00:36:44 CET)
To improve the interfacial bonding of sisal fiber reinforced polylactide biocomposites, polylactide (PLA) and sisal fibers (SF) were melt-blended to fabricate bio-based composites via in situ reactive interfacial compatibilization with the addition of an epoxy-functionalized oligomer (ADR). FTIR analysis and SEM characterization demonstrated that the PLA molecular chain was bonded to the fiber surface and that the epoxy-functionalized oligomer played a hinge-like role between the sisal fibers and the PLA matrix, which resulted in improved interfacial adhesion between the fibers and the PLA matrix. The interfacial reaction and the microstructures of the composites were further investigated by thermal and rheological analyses, which indicated that the mobility of the PLA molecular chain in the composites was restricted by the introduction of the ADR oligomer, which in turn reflected the improved interfacial interaction between SF and the PLA matrix. These conclusions were further supported by the activation energies of glass transition relaxation (ΔEa) of the composites, calculated via dynamic mechanical analysis. The mechanical properties of the PLA/SF composites were simultaneously reinforced and toughened via the addition of the ADR oligomer. The interfacial interaction and the structure-property relationship of the composites are the key points of this study.
ARTICLE | doi:10.20944/preprints201710.0141.v1
Subject: Engineering, Control & Systems Engineering Keywords: Vertical coal bunker; Coal given chamber; Floor heave; Wall-mounted coal bunker; Reinforcement; Self-bearing system
Online: 20 October 2017 (15:31:57 CEST)
Serious damage caused by floor heave in the coal given chamber of a vertical coal bunker is one of the challenges faced in underground coal mines. Engineering practice shows that it is more difficult to maintain the coal given chamber (CGC) than a roadway. More importantly, repairing the CGC during mining practice poses major safety risks and reduces production. Based on the case of the serious collapse that occurred in the bearing structure of the CGC at the lower part of the 214# coal bunker in Xiashijie mine, China, this work analysed (i) the main factors influencing floor heave and (ii) the failure mechanism of the load-bearing structure in the CGC, using FLAC2D numerical models and an expansion experiment. The analysis indicates that floor heave, caused mainly by mine water, is the basic reason for the instability and repeated failure of the CGC in the 214# coal bunker. A new coal bunker, built without a CGC, is therefore proposed and put into practice to replace the 214# coal bunker. The FLAC3D software program is adopted to establish a numerical model of the wall-mounted coal bunker (WMCB), and the stability of the rock surrounding the WMCB is simulated and analysed. The results show that: (1) the rock surrounding the sandstone segment is basically stable; (2) the surrounding rock in the coal seam segment, which moves into the inside of the bunker, is the main zone of deformation for the entire rock mass surrounding the bunker. This surrounding rock is then controlled effectively by means of high-strength bolt–cable combined supporting technology. According to the geological conditions of the WMCB, a self-bearing system, which includes (i) H-steel beams, (ii) H-steel brackets, and (iii) self-locking anchor cables, is established and serves as a substitute for the CGC to transfer the whole weight of the bunker to stable surrounding rock.
The stability of the new coal bunker has been verified by field testing, and the coal mine has gained an economic benefit of 158.026174 million RMB over three years. The new WMCB has thus made production more effective and can provide a helpful reference for the construction of vertical bunkers under similar geological conditions.
Subject: Engineering, Civil Engineering Keywords: structural safety assessment; experimental monitoring; strain transducers; reinforcement; civil engineering; optical fiber sensors; life time structural monitoring; Brillouin
Online: 4 June 2020 (03:54:44 CEST)
This work describes a new transducer prototype for continuous monitoring in both the structural and geotechnical fields. The transducer essentially consists of a wire of optical fiber embedded between two fiber tapes (fiberglass or carbon fiber) and glued with a matrix of polyester resin. The ends of the optical fiber wire were connected to a control unit whose detection system is based on Brillouin optical time-domain frequency analysis. Three laboratory tests were carried out to evaluate the sensor's reliability and accuracy. In each experiment, the transducer was applied to a sample of inclinometer casing set in different configurations and under different constraint conditions. The experimental data collected were compared with theoretical models and with data obtained from other measuring instruments, performing validation and calibration of the transducer at the same time. Several diagrams compare the transducer's readings with these references and highlight its suitability for the monitoring and maintenance of structures. The characteristics of the transducer suggest its use as a mixed system for reinforcing and monitoring, especially in the lifetime maintenance of critical infrastructure such as transportation and service networks, and historical heritage.
ARTICLE | doi:10.20944/preprints202205.0027.v1
Subject: Engineering, Industrial & Manufacturing Engineering Keywords: Additive manufacturing (AM); Wire arc additive manufacturing (WAAM); Weld cladding; Residual stresses; Reinforcement; Hole drilling method; LS-Dyna; Numerical Simulation
Online: 5 May 2022 (08:46:31 CEST)
Cladding is typically used to protect components from wear and corrosion while also improving the aesthetic value and reliability of the substrate. The cladding process induces significant residual stresses due to the temperature difference between the substrate and the clad layer. However, these residual stresses can be effectively utilized by modifying process and geometrical parameters. This paper introduces a novel methodology for using the weld-cladding process as a cost-effective alternative to various existing reinforcement techniques. Numerical analyses are performed to maximize the reinforcement of a cylindrical tool. The investigation of how the weld cladding develops compressive stresses on the specimen in response to a change in the weld beads and the welding sequence is presented. For the benchmark shape, experimental verification of the numerical model is performed. The impact of the distance between the weld beads and the effect of the tool diameter are numerically investigated. Furthermore, the variation in compressive stresses due to temperature fluctuations during the extrusion process has been evaluated. The results showed that adequate compressive stresses are generated on the welded parts through the cladding process after cooling. Hence, the targeted reinforcement of the substrate can be achieved by optimizing the welding sequence and process parameters.
ARTICLE | doi:10.20944/preprints202108.0018.v1
Subject: Physical Sciences, Radiation & Radiography Keywords: deep reinforcement learning; source search and localization; active search; gamma radiation; source parameter estimation; sequential decision making; non-convex environment
Online: 2 August 2021 (11:14:24 CEST)
Rapid search and localization of nuclear sources can be an important aspect of preventing human harm from illicit material in dirty bombs or from contamination. In the case of a single mobile radiation detector, there are numerous challenges to overcome, such as weak source intensity, multiple sources, background radiation, and the presence of obstructions, i.e., a non-convex environment. In this work, we investigate the sequential decision-making capability of deep reinforcement learning in the nuclear source search context. A novel neural network architecture (RAD-A2C) based on the advantage actor critic (A2C) framework and a particle filter gated recurrent unit for localization is proposed. Performance is studied in randomized 20 x 20 m convex and non-convex environments across a range of signal-to-noise ratios (SNRs) for a single detector and a single source. RAD-A2C performance is compared to both an information-driven controller that uses a bootstrap particle filter and a gradient search (GS) algorithm. We find that RAD-A2C has comparable performance to the information-driven controller across SNRs in a convex environment, at lower computational complexity per action. RAD-A2C far outperforms the GS algorithm in the non-convex environment, with a greater than 95% median completion rate for up to seven obstructions.
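The actor-critic idea underlying A2C can be shown on a toy search task: a tabular one-step actor-critic learns to walk a 1D corridor toward a "source" at one end. This is a minimal sketch of the learning rule only; the corridor, rewards, and hyperparameters are invented here and bear no relation to RAD-A2C's network or environment.

```python
import math
import random

random.seed(1)

N = 6                        # corridor cells 0..5; the "source" sits in cell 5
GAMMA, A_LR, C_LR = 0.95, 0.2, 0.2

theta = [[0.0, 0.0] for _ in range(N)]   # actor: preferences for (left, right)
value = [0.0] * N                         # critic: state-value estimates

def softmax_sample(prefs):
    """Sample an action index from softmax(prefs)."""
    exps = [math.exp(p) for p in prefs]
    r, acc = random.random() * sum(exps), 0.0
    for a, e in enumerate(exps):
        acc += e
        if r <= acc:
            return a
    return len(prefs) - 1

for _ in range(2000):                     # episodes
    s = 0
    while s != N - 1:
        a = softmax_sample(theta[s])
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N - 1 else 0.0   # reward only at the source
        # TD error = target - current estimate (terminal state has value 0)
        td = r + (0.0 if s2 == N - 1 else GAMMA * value[s2]) - value[s]
        value[s] += C_LR * td             # critic update
        # actor update: softmax policy-gradient step scaled by the TD error
        exps = [math.exp(p) for p in theta[s]]
        z = sum(exps)
        for b in range(2):
            grad = (1.0 if b == a else 0.0) - exps[b] / z
            theta[s][b] += A_LR * td * grad
        s = s2

# each interior cell should now prefer moving right (toward the source)
print([int(theta[s][1] > theta[s][0]) for s in range(N - 1)])
```

RAD-A2C replaces the tables with a neural network and feeds the actor-critic head from a particle-filter GRU state estimate, but the TD-error-weighted policy update above is the same mechanism.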
ARTICLE | doi:10.20944/preprints201907.0311.v1
Subject: Engineering, Automotive Engineering Keywords: Cyber-Physical Systems; reliability assessment; Internet-of-Things; LiDAR sensor; driving assistance; obstacle recognition; reinforcement learning; Artificial Intelligence-based modelling
Online: 28 July 2019 (12:38:28 CEST)
Currently, the most important challenge in any assessment of state-of-the-art sensor technology and its reliability is achieving road traffic safety targets. The research reported in this paper is focused on the design of a procedure for evaluating the reliability of Internet-of-Things (IoT) sensors and the use of a Cyber-Physical System (CPS) to implement that evaluation procedure. An important requirement for generating real critical situations under safe conditions is the capability of managing a co-simulation environment in which both real and virtual sensory data can be processed. An IoT case study consisting of a LiDAR-based collaborative map is then proposed, in which both real and virtual computing nodes with their corresponding sensors exchange information. Specifically, the sensor chosen for this study is an Ibeo Lux 4-layer LiDAR sensor with added IoT capabilities. Implementation is through an artificial-intelligence-based modeling library for sensor data-prediction error at a local level, and a self-learning decision-making model based on a Q-learning method at a global level. Its aim is to determine the best model behavior and to trigger the updating procedure, if required. Finally, an experimental evaluation of this framework is performed using simulated and real data.
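A Q-learning-driven "keep or update the model" decision of the kind described can be sketched on a two-state toy MDP: the state is the current prediction-error level and the actions are keeping or retraining the model. All states, rewards, and transition probabilities below are invented for illustration and are not the paper's actual setup.

```python
import random

random.seed(2)

# toy MDP: states = prediction-error level, actions = keep or retrain the model
LOW_ERR, HIGH_ERR = 0, 1
KEEP, RETRAIN = 0, 1
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

q = [[0.0, 0.0], [0.0, 0.0]]             # Q-table: q[state][action]

def step(s, a):
    """Return (next_state, reward); all numbers are illustrative."""
    if a == RETRAIN:
        return LOW_ERR, -0.2             # retraining cost, error resets
    if s == LOW_ERR:
        # the error may drift upward while the model is kept unchanged
        s2 = HIGH_ERR if random.random() < 0.2 else LOW_ERR
        return s2, 1.0                   # good predictions meanwhile
    return HIGH_ERR, -1.0                # keeping a stale model is penalized

s = LOW_ERR
for _ in range(20000):
    a = random.randrange(2) if random.random() < EPS \
        else (KEEP if q[s][KEEP] >= q[s][RETRAIN] else RETRAIN)
    s2, r = step(s, a)
    q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])   # Q-learning update
    s = s2

policy = [KEEP if q[st][KEEP] >= q[st][RETRAIN] else RETRAIN for st in range(2)]
print(policy)   # expected: keep while error is low, retrain once it is high
```

The learned policy triggers the (costly) update only when the error state justifies it, which is the same trade-off the paper's global decision-making layer has to resolve.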
ARTICLE | doi:10.20944/preprints201907.0175.v1
Subject: Engineering, Civil Engineering Keywords: masonry structures; shear walls; clay brick (CB); calcium-silicate (Ca-Si) masonry units; autoclaved aerated concrete masonry units (AAC); bed joints reinforcement; shear strength; strain angle; wall stiffness
Online: 15 July 2019 (05:37:44 CEST)
The area of Central and Eastern Europe, and thus Poland, is not exposed to the effects of seismic actions. Any possible tremors may be caused by coal or copper mining. Wind, rheological effects, the impact of other objects, or a non-uniform substrate are the predominant types of loading included in calculations for stiffening walls. The majority of buildings in Poland, as in most other European countries, are low or medium-high brick buildings. Some traditional materials, like solid brick (>10% of the construction materials market), are still used, but autoclaved aerated concrete (AAC) and cement-sand calcium-silicate (Ca-Si) elements with thin joints prevail (>70%) on the Polish market. Adding reinforcement only to the bed joints of a wall is a satisfactory solution (in addition to confining) for the seismic actions occurring in Poland, improving both ULS and SLS. This paper presents results from our own tests on horizontally sheared walls without reinforcement and with different types of reinforcement. The discussion covers 51 walls made of solid brick (CB) reinforced with steel bars and steel trusses, and results from tests on 15 walls made of calcium-silicate (Ca-Si) and AAC masonry units reinforced with steel trusses and plastic meshes. Taking into account our own tests and those conducted by other authors, empirical relationships were determined on the basis of more than 90 walls. They are applicable at the design and construction phases to determine the likely effect of reinforcement on cracking stress, shear deformation, and wall stiffness.