Submitted: 05 June 2026
Posted: 08 June 2026
Abstract
Keywords:
1. Introduction and Positioning
1.1. Physical AI as Embodied, Energy-Constrained Intelligence
1.2. Why a Reflex-Policy Perspective Is Needed
1.3. Contribution and Limits of This Perspective
2. Reflex and Policy Layers: Definitions, Cooperation and EROIE
2.1. Reflex Layers and Policy Layers
2.2. Mutual Protection Between Reflex and Policy
2.3. Layer Spectrum and Minimal Sufficient Inference
| Level | Inference type | Typical example | Implementation | Energy / latency envelope |
| --- | --- | --- | --- | --- |
| 0 | Direct physical coupling | Mechanical fuse, passive thermostat | Material or mechanical structure | Passive / zero-power |
| 1 | Binary threshold | Overcurrent, PV bypass, collision flag | Comparator or binary crossbar | pJ-nJ, ns-µs |
| 2 | Analog coordination | Tilt correction, hysteresis, timers | Op-amps, analog logic | nJ, µs |
| 3 | Deterministic digital reflex | PID, sequence, feedforward table | MCU, CPLD, FPGA | µJ, µs-ms |
| 4 | Local policy | Adaptive MPPT, terrain selection | Edge NPU, DSP | mJ, ms-s |
| 5 | Global policy | Fleet learning, large VLA/world model | Cloud, GPU cluster | J+, seconds-minutes |
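The idea of minimal sufficient inference can be illustrated with a short Python sketch: given an event, pick the lowest level in the table that is both expressive enough and fast enough. The latency bounds below are order-of-magnitude values read loosely from the table's last column and are assumptions, as are the names `Layer`, `LAYERS` and `minimal_sufficient_layer`. Level 0 is left out because the table gives it no latency figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    level: int
    inference: str
    worst_latency_s: float  # assumed upper bound, not a measured value


# Ordered from cheapest to most expensive; latency bounds are illustrative.
LAYERS = [
    Layer(1, "Binary threshold", 1e-6),
    Layer(2, "Analog coordination", 1e-5),
    Layer(3, "Deterministic digital reflex", 1e-3),
    Layer(4, "Local policy", 1.0),
    Layer(5, "Global policy", 60.0),
]


def minimal_sufficient_layer(required_level: int, deadline_s: float) -> Optional[Layer]:
    """Return the lowest layer that meets both the required level and the deadline."""
    for layer in LAYERS:
        if layer.level >= required_level and layer.worst_latency_s <= deadline_s:
            return layer
    return None  # no layer can serve this event within its deadline


if __name__ == "__main__":
    print(minimal_sufficient_layer(1, 1e-3))  # a threshold event stays at Level 1
    print(minimal_sufficient_layer(4, 1e-3))  # local policy cannot meet 1 ms
```

Under these assumptions, an event that needs only a threshold decision resolves at Level 1 even when its deadline is relaxed, while a request for local policy within 1 ms returns `None`, signalling that the event has to be covered by a lower-layer reflex or given a longer deadline.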
2.4. Formalizing EROIE
Incremental Form for Comparing Architectures
3. Biological Grounding
| Layer | Biological analogy | Main function | Engineering analogy |
| --- | --- | --- | --- |
| 1 Physical Reflex | Spinal reflex arc | Immediate protective action | Comparator / binary crossbar near sensor-actuator path |
| 2 Analog Reflex | Brainstem, vestibular reflexes, CPGs | Multi-signal coordination | Analog logic, timers, hysteresis |
| 3 Digital Reflex | Cerebellar feedforward control | Learned deterministic routines | MCU/CPLD/FPGA with fixed-point routines |
| 4 Local Policy | Basal ganglia | Action selection, habits, local adaptation | NPU/DSP/edge optimizer |
| 5 Global Policy | Cortex | Planning, abstraction, long-term learning | Cloud/GPU/large AI model |




4. A Five-Layer Reflex-Policy Architecture
4.1. Layer 1 - Physical Reflex

4.2. Layer 2 - Analog Reflex

4.3. Layer 3 - Digital Reflex

4.4. Layers 4 and 5 - Local and Global Policy


4.5. Cross-Layer Interfaces and Failure Behavior
| Failure condition | Expected behavior | Minimum safe layer |
| --- | --- | --- |
| Layer 5 unavailable | Layer 4 continues local policy; Layers 1-3 remain active | Layer 1/2 for safety |
| Layer 4 overloaded | Layer 3 deterministic routines continue; urgent events remain local | Layer 1/2 |
| Layer 3 firmware fault | Analog and physical reflexes hold safe states or inhibit actuation | Layer 1/2 |
| Sensor mismatch | Policy detects repeated false negatives/positives and updates sensing or thresholds | Layer 4/5 for diagnosis |
| Rule-map update error | Formal constraints and rollback prevent unsafe reflex permissions | Layer 1 safe default |
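The first three rows of the table suggest a simple degradation rule, sketched below in Python: a fault at layer k removes layer k and everything above it from the control chain, while the layers below keep running. That rule is an assumption inferred from the table, not a stated design. The mode names and the functions `operating_layers` and `degraded_mode` are hypothetical, and the table does not say what happens when Layer 1 or 2 itself fails, so the sketch falls back to inhibiting actuation in that case.

```python
from typing import List, Set

ALL_LAYERS = [1, 2, 3, 4, 5]
SAFETY_LAYERS = {1, 2}  # "Minimum safe layer" column: Layer 1/2


def operating_layers(failed: Set[int]) -> List[int]:
    """Layers still in the control chain, assuming a fault cuts off its own layer and all above."""
    if not failed:
        return list(ALL_LAYERS)
    return list(range(1, min(failed)))


def degraded_mode(failed: Set[int]) -> str:
    """Map the surviving layers to a coarse operating mode (mode names are illustrative)."""
    layers = operating_layers(failed)
    if not SAFETY_LAYERS.issubset(layers):
        return "inhibit-actuation"  # assumed safe default when a safety layer is lost
    top = max(layers)
    return {
        5: "full",
        4: "local-policy",
        3: "deterministic-reflex",
        2: "safe-hold",
    }[top]


if __name__ == "__main__":
    for failed in (set(), {5}, {4}, {3}, {2}):
        print(sorted(failed), operating_layers(failed), degraded_mode(failed))
```

The sensor-mismatch and rule-map rows are not modelled here, because they describe diagnosis and rollback rather than loss of a layer.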
5. Spintronics as a Candidate Reflex Substrate
5.1. Binary Availability and the Architecture Gap
| Technology element | Current maturity | Relevance to this paper | Caution |
| --- | --- | --- | --- |
| Discrete MRAM / STT-MRAM | Commercial products | Shows non-volatile binary magnetic memory availability | Memory product, not reflex module |
| Embedded MRAM / foundry macros | Foundry-level offering in selected nodes | Supports integration with CMOS control logic | Access depends on node and design rules |
| Binary MTJ crossbar with comparator readout | Research/prototype target | Candidate Layer 1 rule fabric | Needs reflex-specific validation |
| Safety-certified reflex module | Future system product | Potential robotics/energy/IoT component | Requires verification and qualification |
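To make the "binary crossbar with comparator readout" row more concrete, the following Python sketch models one possible behavior in software. Each row stores a binary rule, the inputs are binary flags from threshold comparators, and a row fires when the number of stored bits that coincide with active flags reaches a per-row threshold. This counting model, the flag names and the function `crossbar_reflex` are assumptions made for illustration; the table specifies neither a readout scheme nor a rule encoding.

```python
from typing import List, Sequence


def crossbar_reflex(
    rule_map: Sequence[Sequence[int]],
    flags: Sequence[int],
    thresholds: Sequence[int],
) -> List[int]:
    """Evaluate every rule row against the input flags and return one output bit per row."""
    if len(rule_map) != len(thresholds):
        raise ValueError("one threshold is required per rule row")
    outputs = []
    for row, threshold in zip(rule_map, thresholds):
        if len(row) != len(flags):
            raise ValueError("rule row width must match the number of flags")
        # Count positions where a stored 1 meets an active flag.
        matches = sum(bit & flag for bit, flag in zip(row, flags))
        # Comparator readout: fire when the count reaches the row threshold.
        outputs.append(1 if matches >= threshold else 0)
    return outputs


if __name__ == "__main__":
    # Hypothetical flag order: [overcurrent, overtemperature, collision]
    rules = [
        [1, 1, 0],  # cut power if overcurrent OR overtemperature (threshold 1)
        [0, 0, 1],  # brake on collision (threshold 1)
        [1, 1, 0],  # full shutdown if overcurrent AND overtemperature (threshold 2)
    ]
    thresholds = [1, 1, 2]
    print(crossbar_reflex(rules, [1, 0, 0], thresholds))
```

With a threshold of 1 a row behaves like an OR over its selected flags, and with a threshold equal to the number of stored ones it behaves like an AND, so a single fabric of this kind can hold both kinds of rule.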
5.2. Decision-Power Path Isolation

5.3. Why Binary, Not Primarily Multilevel AI Acceleration
5.4. Alternatives and Honest Comparison
| Substrate | Strengths | Weaknesses | Best role in this framework |
| --- | --- | --- | --- |
| CMOS comparators | Mature, cheap, certifiable | Limited rule density; volatile configuration unless supported | Simple Layer 1 thresholds |
| SRAM / LUT / FPGA | Fast, programmable | Volatile, leakage, configuration overhead | Layer 3 deterministic reflexes |
| RRAM / memristor crossbar | Dense, in-memory compute | Variability, endurance and analog readout challenges | Research alternative for Layers 1-4 |
| FeFET arrays | CMOS-friendly, non-volatile | Endurance and process maturity vary | Alternative non-volatile rule memory |
| Binary MTJ / MRAM crossbar | Non-volatile, fast read, high endurance potential, magnetic robustness | Reflex crossbar not yet productized; design/safety validation needed | Layer 1 rule maps and configuration storage |
6. Application Windows
6.1. PV Harvesting and Battery Balancing
6.2. Humanoid and Mobile Robotics
6.3. Energy-Harvesting IoT
7. Comparison with Existing Architectures
| Architecture | What it already provides | What is missing for this paper’s goal | Relation to proposed framework |
| --- | --- | --- | --- |
| Brooks subsumption | Layered behavior and robust local control [7,8] | Not explicitly ADC/energy/EROIE driven | Historical precedent for layered competence |
| Hierarchical RL | Temporal abstraction and sub-policy learning [9] | Often compute-centric and not hardware-local | Useful for policy-to-reflex training |
| ROS 2 / micro-ROS | Modular middleware and embedded integration [59] | Middleware does not define physical reflex rule fabrics | Complementary system software |
| Mixed-criticality systems | Temporal isolation and criticality-aware scheduling [60] | Mostly software/scheduling boundary, not energy-return metric | Useful verification precedent |
| Neuromorphic / SNN | Event-driven low-energy computation [16,17,18] | Does not by itself assign events to physical energy paths | Candidate implementation style |
| Physical neural networks | Computation in physical substrates [19,20] | Usually focuses on learned inference, not safety reflex rules | Adjacent hardware paradigm |
| Proposed reflex-policy | ADC-light/ADC-heavy partition, mutual protection, EROIE | Needs prototype and formal validation | Perspective and research roadmap |
7.1. Tightened Novelty Claims
7.2. Validation Roadmap
7.3. Limitations
8. Conclusion
Acknowledgments
Conflicts of Interest
Acronym Table
| Acronym | Meaning | Use in this paper |
| --- | --- | --- |
| ADC | Analog-to-Digital Converter | Conversion avoided or minimized in low reflex layers |
| AI | Artificial Intelligence | General field |
| ANN | Artificial Neural Network | Conventional neural-network model |
| BEOL | Back-End-of-Line | CMOS integration level relevant to MTJs |
| BMS | Battery Management System | Battery supervision and safety |
| CMOS | Complementary Metal-Oxide-Semiconductor | Mainstream semiconductor technology |
| CPG | Central Pattern Generator | Biological rhythmic-control circuit |
| CPLD | Complex Programmable Logic Device | Deterministic digital reflex implementation |
| DSP | Digital Signal Processor | Embedded signal processing |
| EROIE | Energy Returned on Invested Energy for Embodied Intelligence | Scale-down metric proposed in this paper |
| FPGA | Field-Programmable Gate Array | Deterministic programmable control |
| HRL | Hierarchical Reinforcement Learning | Policy decomposition baseline |
| IoT | Internet of Things | Distributed sensing domain |
| MCU | Microcontroller Unit | Embedded controller |
| MPPT | Maximum Power Point Tracking | PV energy-harvesting control |
| MRAM | Magnetoresistive Random-Access Memory | Non-volatile spintronic memory |
| MTJ | Magnetic Tunnel Junction | Core binary spintronic device |
| NPU | Neural Processing Unit | Edge AI accelerator |
| PID | Proportional-Integral-Derivative | Classical control method |
| PNN | Physical Neural Network | Computation in physical substrates |
| PV | Photovoltaic | Solar energy harvesting |
| RL | Reinforcement Learning | Policy optimization |
| SNN | Spiking Neural Network | Event-driven neuromorphic model |
| TRL | Technology Readiness Level | Maturity framing for devices and systems |
| VLA | Vision-Language-Action | Embodied AI model family |
References
1. Sitti, M. Physical intelligence as a new paradigm. Extreme Mechanics Letters 46, 101340 (2021).
2. Ray, P. P. Physical AI: Bridging the sim-to-real divide toward embodied, ethical, and autonomous intelligence. Machine Learning for Computational Science and Engineering 2, 1 (2026).
3. Brohan, A. et al. RT-2: Vision-language-action models transfer web knowledge to robotic control. arXiv:2307.15818 (2023).
4. Kim, M. J. et al. OpenVLA: An open-source vision-language-action model. arXiv:2406.09246 (2024).
5. Ha, D. and Schmidhuber, J. World models. arXiv:1803.10122 (2018).
6. Hasani, R., Lechner, M., Amini, A., Rus, D. and Grosu, R. Liquid time-constant networks. Proceedings of the AAAI Conference on Artificial Intelligence 35(9), 7657-7666 (2021).
7. Brooks, R. A. A robust layered control system for a mobile robot. IEEE Journal on Robotics and Automation 2(1), 14-23 (1986).
8. Brooks, R. A. Intelligence without representation. Artificial Intelligence 47(1-3), 139-159 (1991).
9. Botvinick, M. M. Hierarchical models of behavior and prefrontal function. Trends in Cognitive Sciences 12(5), 201-208 (2008).
10. Prescott, T. J. Forced moves or good tricks in design space? Adaptive Behavior 15(1), 9-31 (2007).
11. Zhang, H., Solak, G. and Ajoudani, A. Bresa: Bio-inspired reflexive safe reinforcement learning for contact-rich robotic tasks. arXiv:2503.21989 (2025).
12. Wulf, W. A. and McKee, S. A. Hitting the memory wall: Implications of the obvious. ACM SIGARCH Computer Architecture News 23(1), 20-24 (1995).
13. Barroso, L. A. and Hoelzle, U. The case for energy-proportional computing. Computer 40(12), 33-37 (2007).
14. Sze, V., Chen, Y.-H., Yang, T.-J. and Emer, J. S. Efficient processing of deep neural networks: A tutorial and survey. Proceedings of the IEEE 105(12), 2295-2329 (2017).
15. Horowitz, M. Computing’s energy problem (and what we can do about it). IEEE International Solid-State Circuits Conference Digest, 10-14 (2014).
16. Mead, C. Neuromorphic electronic systems. Proceedings of the IEEE 78(10), 1629-1636 (1990).
17. Maass, W. Networks of spiking neurons: The third generation of neural network models. Neural Networks 10(9), 1659-1671 (1997).
18. Roy, K., Jaiswal, A. and Panda, P. Towards spike-based machine intelligence with neuromorphic computing. Nature 575, 607-617 (2019).
19. Momeni, A. et al. Training of physical neural networks. Nature 645, 53-61 (2025).
20. Wright, L. G. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549-555 (2022).
21. Kandel, E. R. et al. Principles of Neural Science, 6th ed. McGraw-Hill (2021).
22. Purves, D. et al. Neuroscience, 6th ed. Oxford University Press (2018).
23. Sherrington, C. S. The Integrative Action of the Nervous System. Yale University Press (1906).
24. Jackson, J. H. The Croonian lectures on evolution and dissolution of the nervous system. British Medical Journal (1884).
25. York, G. K. and Steinberg, D. A. Hughlings Jackson’s neurological ideas. Brain 134(10), 3106-3113 (2011).
26. Grillner, S. Biological pattern generation: The cellular and computational logic of networks in motion. Neuron 52(5), 751-766 (2006).
27. Grillner, S. and Robertson, B. The basal ganglia over 500 million years. Current Biology 26(20), R1088-R1100 (2016).
28. Marder, E. and Bucher, D. Central pattern generators and the control of rhythmic movements. Current Biology 11(23), R986-R996 (2001).
29. Angelaki, D. E. and Cullen, K. E. Vestibular system: The many facets of a multimodal sense. Annual Review of Neuroscience 31, 125-150 (2008).
30. Cullen, K. E. The vestibular system: Multimodal integration and encoding of self-motion for motor control. Journal of Neurophysiology 107(3), 727-738 (2012).
31. Ito, M. Cerebellar circuitry as a neuronal machine. Progress in Neurobiology 78(3-5), 272-303 (2006).
32. Wolpert, D. M., Miall, R. C. and Kawato, M. Internal models in the cerebellum. Trends in Cognitive Sciences 2(9), 338-347 (1998).
33. Albus, J. S. A theory of cerebellar function. Mathematical Biosciences 10(1-2), 25-61 (1971).
34. Marr, D. A theory of cerebellar cortex. Journal of Physiology 202(2), 437-470 (1969).
35. Doya, K. Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology 10(6), 732-739 (2000).
36. Miller, E. K. and Cohen, J. D. An integrative theory of prefrontal cortex function. Annual Review of Neuroscience 24, 167-202 (2001).
37. Card, G. and Dickinson, M. H. Visually mediated motor planning in the escape response of Drosophila. Current Biology 18(17), 1300-1307 (2008).
38. Morimoto, M. M. et al. Spatial readout of visual looming in the central brain of Drosophila. eLife 9, e57685 (2020).
39. Dorkenwald, S. et al. Neuronal wiring diagram of an adult brain. Nature 634, 124-138 (2024).
40. Everspin Technologies. MRAM product information. Retrieved May 2026 from everspin.com.
41. GlobalFoundries. Embedded memory and MRAM technology information. Retrieved May 2026 from gf.com.
42. Samsung Foundry. Specialty technology: eMRAM. Retrieved May 2026 from semiconductor.samsung.com.
43. Renesas Electronics. IDT offers Avalanche Technology’s MRAM devices. News release (2019). Retrieved May 2026 from renesas.com.
44. Apalkov, D., Dieny, B. and Slaughter, J. M. Magnetoresistive random access memory. Proceedings of the IEEE 104(10), 1796-1830 (2016).
45. Bhatti, S. et al. Spintronics based random access memory: A review. Materials Today 20(9), 530-548 (2017).
46. Grollier, J. et al. Neuromorphic spintronics. Nature Electronics 3, 360-370 (2020).
47. Torrejon, J. et al. Neuromorphic computing with nanoscale spintronic oscillators. Nature 547, 428-431 (2017).
48. Borders, W. A. et al. Integer factorization using stochastic magnetic tunnel junctions. Nature 573, 390-393 (2019).
49. Manipatruni, S. et al. Scalable energy-efficient magnetoelectric spin-orbit logic. Nature 565, 35-42 (2019).
50. Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. and Eleftheriou, E. Memory devices and applications for in-memory computing. Nature Nanotechnology 15, 529-544 (2020).
51. Ielmini, D. and Wong, H.-S. P. In-memory computing with resistive switching devices. Nature Electronics 1, 333-343 (2018).
52. Burr, G. W. et al. Neuromorphic computing using non-volatile memory. Advances in Physics: X 2(1), 89-124 (2017).
53. Gokmen, T. and Vlasov, Y. Acceleration of deep neural network training with resistive cross-point devices. Frontiers in Neuroscience 10, 333 (2016).
54. Esram, T. and Chapman, P. L. Comparison of photovoltaic array maximum power point tracking techniques. IEEE Transactions on Energy Conversion 22(2), 439-449 (2007).
55. Patel, H. and Agarwal, V. Maximum power point tracking scheme for PV systems operating under partially shaded conditions. IEEE Transactions on Industrial Electronics 55(4), 1689-1698 (2008).
56. Silvestre, S., Boronat, A. and Chouder, A. Study of bypass diodes configuration on PV modules. Applied Energy 86(9), 1632-1640 (2009).
57. Daliento, S., Mele, L. and Spirito, P. Analysis and modeling of hot spot phenomena in photovoltaic modules. IEEE Transactions on Electron Devices 59(3), 727-734 (2012).
58. Cao, J., Schofield, N. and Emadi, A. Battery balancing methods: A comprehensive review. IEEE Vehicle Power and Propulsion Conference (2008).
59. Macenski, S., Foote, T., Gerkey, B., Lalancette, C. and Woodall, W. Robot Operating System 2: Design, architecture, and uses in the wild. Science Robotics 7(66), eabm6074 (2022).
60. Burns, A. and Davis, R. I. A survey of research into mixed criticality systems. ACM Computing Surveys 50(6), Article 82 (2017).
61. Murphy, D. J. and Hall, C. A. S. Year in review - EROI or energy return on energy invested. Annals of the New York Academy of Sciences 1185, 102-118 (2010).
62. Hall, C. A. S., Lambert, J. G. and Balogh, S. B. EROI of different fuels and the implications for society. Energy Policy 64, 141-152 (2014).
63. Safari, A., Sorouri, H., Oshnoei, A. and Blaabjerg, F. A state-of-the-art review on battery cell balancing strategies. Discover Energy 5, 31 (2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).