This position paper argues that long-horizon robotics should optimize for persistent autonomy, not merely longer reset-based episodes. Real deployments require robots that remain safely useful over days to months while accumulating memory, adapting to evolving human preferences, recovering from inevitable failures, and managing constrained physical and computational resources. Many embodied AI evaluations still inherit the logic of episodic reinforcement learning---where environments are frequently reset and hidden human labor often goes unreported---but continuous operation exposes vulnerabilities in state continuity, resource coupling, recovery, and maintenance. Although long-term autonomy is not conceptually new, recent progress in generalist robot policies, open robot datasets, and language-conditioned control makes persistence a primary machine-learning evaluation target rather than a deferred systems-engineering concern. As base policies grow more competent, the practical bottlenecks of autonomy concentrate in memory staleness, hidden intervention burden, recovery loops, and maintenance debt. To align evaluation with these realities, we propose a persistent-autonomy scorecard and a layered benchmark blueprint centered on long-run service utility, intervention burden, recovery quality, proactive usefulness, memory hygiene, uptime, and wear-adjusted throughput. By treating persistence as the fundamental scientific object, modern robot learning can shift from measuring isolated task success to building systems that turn calendar time into compounding competence.