Article
Version 1
Preserved in Portico This version is not peer-reviewed
Proximal Policy Optimization for Radiation Source Search
Version 1
: Received: 31 July 2021 / Approved: 2 August 2021 / Online: 2 August 2021 (11:14:24 CEST)
How to cite: Proctor, P.; Teuscher, C.; Hecht, A.; Osiński, M. Proximal Policy Optimization for Radiation Source Search. Preprints 2021, 2021080018. https://doi.org/10.20944/preprints202108.0018.v1 Proctor, P.; Teuscher, C.; Hecht, A.; Osiński, M. Proximal Policy Optimization for Radiation Source Search. Preprints 2021, 2021080018. https://doi.org/10.20944/preprints202108.0018.v1
Abstract
Rapid search and localization for nuclear sources can be an important aspect in preventing human harm from illicit material in dirty bombs or from contamination. In the case of a single mobile radiation detector, there are numerous challenges to overcome such as weak source intensity, multiple sources, background radiation, and the presence of obstructions, i.e., a non-convex environment. In this work, we investigate the sequential decision making capability of deep reinforcement learning in the nuclear source search context. A novel neural network architecture (RAD-A2C) based on the actor critic (A2C) framework and a particle filter gated recurrent unit for localization is proposed. Performance is studied in a randomized 20 x 20 m convex and non-convex environment across a range of signal-to-noise ratio (SNR)s for a single detector and single source. RAD-A2C performance is compared to both an information-driven controller that uses a bootstrap particle filter and to a gradient search (GS) algorithm. We find that the RAD-A2C has comparable performance to the information-driven controller across SNR in a convex environment and at lower computational complexity per action. The RAD-A2C far outperforms the GS algorithm in the non-convex environment with greater than 95% median completion rate for up to seven obstructions.
Supplementary and Associated Material
https://github.com/peproctor/radiation_ppo: Code for the project.
Keywords
deep reinforcement learning; source search and localization; active search; gamma radiation; source parameter estimation; sequential decision making; non-convex environment}
Subject
Physical Sciences, Radiation and Radiography
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Comments (0)
We encourage comments and feedback from a broad range of readers. See criteria for comments and our Diversity statement.
Leave a public commentSend a private comment to the author(s)
* All users must log in before leaving a comment