Configuring routers has traditionally been an expert-driven process, relying on static or rule-based configurations for individual flows. However, under the dynamically varying traffic conditions envisioned in 5G use cases, these traditional approaches can generate sub-optimal configurations.
In this paper, we propose a solution to this problem based on model-based Reinforcement Learning (RL), where the environment is modeled as a Partially Observable Markov Decision Process (POMDP). The system is trained over different configurations for router ingress queue traffic policing, egress queue traffic shaping, and traffic conditions to learn the state transition and observation probabilities of the POMDP model. The optimal policy of the learned model can then be deployed to generate optimal queue configurations automatically, adapting to current traffic conditions. These aspects are demonstrated on a real Ericsson use case with routers configured for 5G slicing.
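To make the POMDP framing concrete, the sketch below shows the standard Bayes-filter belief update that underlies POMDP policies: since the true traffic state is hidden, an agent maintains a probability distribution (belief) over states and updates it from the learned transition and observation probabilities after each action and observation. The states, action, observations, and all probability values here are hypothetical illustrations, not the paper's actual model.

```python
# Generic POMDP belief update (illustrative sketch; all values hypothetical).
# Two hidden traffic states: 0 = "low load", 1 = "high load"
# One action: a = 0 (e.g., keep the current queue configuration)
# Two observations: 0 = "short queue", 1 = "long queue"

# Transition probabilities T[a][s][s2] = P(s2 | s, a)
T = [[[0.9, 0.1],
      [0.2, 0.8]]]

# Observation probabilities O[a][s2][o] = P(o | s2, a)
O = [[[0.8, 0.2],
      [0.3, 0.7]]]

def belief_update(belief, action, observation):
    """Bayes filter: b'(s2) is proportional to O(o|s2,a) * sum_s T(s2|s,a) * b(s)."""
    n = len(belief)
    new_belief = []
    for s2 in range(n):
        predicted = sum(T[action][s][s2] * belief[s] for s in range(n))
        new_belief.append(O[action][s2][observation] * predicted)
    norm = sum(new_belief)
    return [b / norm for b in new_belief]

# Start from a uniform belief; observing a "long queue" shifts the belief
# toward the "high load" state.
b = belief_update([0.5, 0.5], action=0, observation=1)
```

A learned policy then maps this belief (rather than the unobservable true state) to a queue-configuration action, which is what distinguishes a POMDP policy from a fully observable MDP policy.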
Not yet published in IEEE Xplore
Authors:
Ajay Kattepur, Sushant David, Swarup Mohalik
Presented at the Data Driven Intelligent Networking Workshop at the IEEE International Conference on Communications (ICC), June 2021.
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse.