Brown, H.; Fricke, K.; Yoneki, E. World-Models for Bitrate Streaming. Appl. Sci.2020, 10, 6685.
Brown, H.; Fricke, K.; Yoneki, E. World-Models for Bitrate Streaming. Appl. Sci. 2020, 10, 6685.
Adaptive bitrate (ABR) algorithms optimize the quality of streaming experiences for users in client-side video players especially in unreliable or slow mobile networks. Several rule-based heuristic algorithms can achieve stable performance, but they sometimes fail to adapt properly to changing network conditions. Fluctuating bandwidth may cause algorithms to default to behavior that creates a negative experience for the user. ABR algorithms can be generated with reinforcement learning, a decision-making paradigm in which an agent learns to make optimal choices through interactions with an environment. Training reinforcement learning algorithms for bitrate streaming requires building a simulator for an agent to experience interactions quickly; training an agent in the real environment is infeasible due to the long step times in real environments. This project explores using supervised learning to construct a world-model, or a learned simulator, from recorded interactions. A reinforcement learning agent trained inside of the learned model, rather than a simulator, can outperform rule-based heuristics. Furthermore, agents trained inside the learned world-model can outperform model-free agents in low sample regimes. This work highlights the potential for world-models to quickly learn simulators, and to be used to generate optimal policies.
reinforcement learning; bitrate streaming; world-models; video streaming; model-based reinforcement learning
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.