Enabling smarter networks: capitalizing on cloud native in NWDAF use cases
Capitalizing on cloud native in NWDAF
At Ericsson, we’ve created a set of principles for telecom applications that use microservices and state-optimized design to take full advantage of the cloud. These principles allow us to increase the granularity and speed of software upgrades and releases, and to adapt the software architecture to make better use of cloud data center computing resources. Cloud native goes beyond describing the patterns around systems; it also extends to the practices around organizations and technologies – in fact, it’s completely transforming the way networks are operated. That’s why we made the strategic decision to base our NWDAF offering on cloud native 5G Core (5GC) and Ericsson Expert Analytics (EEA), enabling NWDAF to inherently benefit from all the advantages that cloud native brings.
This has resulted in our NWDAF offering being:
- Modular and efficient,
- Aligned to ensure model/use case portability across products,
- Extensible and future proof, and
- Able to extend the data-driven approach building on software (SW) probe.
Below we will discuss these characteristics in detail.
Modular and efficient
Ericsson 5G Core is built on cloud-native, microservices-based technology that combines Evolved Packet Core (EPC) and 5G Core (5GC) network functions for footprint optimization and increased total cost of ownership (TCO) efficiency. Products in this offering are based on Ericsson's Application Development Platform (ADP) framework, which provides a set of architecture principles, design rules and best practices to guide the fundamental design decisions for all cloud-native applications, including EEA. In addition, the framework provides software assets/components that enable applications to fulfill key design principles. It leverages web-scale technology from the Cloud Native Computing Foundation (CNCF) and other open-source projects to encourage inner sourcing of reusable assets. A service user is not exposed to all the dependencies of the service, which allows microservices to be changed without impacting existing integrations (with some exceptions). The ADP framework helps to build a modular architecture by allowing each product to select only the microservices that are essential to its specific requirements.
Ericsson is taking a phased approach to building this architecture to align with our use case-based strategy, as discussed in the blog post, NWDAF: the blessing 5G Core needs to reach a data-driven network. A typical cloud-native deployment consists of two groups of microservices as shown in Figure 1:
- A common set that will support non-functional capabilities required within the product, such as those related to operation and maintenance / software (SW) probe
- Microservices that execute the actual network function (NF) business logic
Common services, such as those supporting configuration management/performance management, are shared with the 3GPP network function business logic in the existing 5G Core products when we bring in NWDAF in the same deployments. These deployments will optimize the cost incurred on these common services and qualify the efficiencies achieved with our current strategy by using existing cloud-native products to host the NWDAF functionality.
To enable the machine learning (ML)/artificial intelligence (AI) platform that constitutes the NWDAF, we are bringing a reusable set of microservices that will be deployed in the product as common services, as visualized in Figure 1.
It is good to define what a model is before we describe the ML/AI platform further. A model is a procedure representing a mathematical algorithm which is trained over a set of data. The model will take a defined set of data as input and generate a result given the model signature (model input and output) which stems from earlier training.
Training refers to the process of analyzing a data set for a particular use case and creating a machine learning model. This creates a software artifact that utilizes ML libraries and other components and model parameters to define the model's behavior. Among other things it includes data preprocessing implementation, which describes or implements how the data transformation procedure is carried out.
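To make the terms above concrete, here is a minimal sketch of what a trained model artifact could look like: a signature, a preprocessing step and learned parameters bundled together. The class names, the feature names (`attach_failures`, `active_sessions`), the weights and the scaling rule are all hypothetical and chosen only for illustration; this is not how Ericsson's models are built.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ModelSignature:
    """Names of the inputs the model expects and of the output it produces."""
    inputs: tuple[str, ...]
    output: str


@dataclass
class TrainedModel:
    """A trained artifact: signature, preprocessing step and learned parameters."""
    signature: ModelSignature
    preprocess: Callable[[Sequence[float]], list[float]]
    weights: list[float]
    bias: float

    def predict(self, record: dict[str, float]) -> dict[str, float]:
        # Reject records that do not match the signature fixed at training time.
        missing = [name for name in self.signature.inputs if name not in record]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        raw = [record[name] for name in self.signature.inputs]
        features = self.preprocess(raw)
        # Hypothetical linear model: weighted sum of the preprocessed features.
        score = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return {self.signature.output: score}


def scale_to_unit(values: Sequence[float]) -> list[float]:
    # Hypothetical preprocessing: divide each feature by an assumed maximum of 100.
    return [v / 100.0 for v in values]


model = TrainedModel(
    signature=ModelSignature(
        inputs=("attach_failures", "active_sessions"),
        output="load_score",
    ),
    preprocess=scale_to_unit,
    weights=[0.5, 0.25],
    bias=0.0,
)
result = model.predict({"attach_failures": 20.0, "active_sessions": 40.0})
```

Keeping the preprocessing inside the artifact means the inference side does not need to re-implement the data transformation that was used during training.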
In the first phase of the development, we include model serving infrastructure in the products as part of the inference phase. Inference refers to the process of using the trained machine learning model to get insights in production. This is where the model is executed in the target environment, using new data (not seen in the model development phase) as input. The inferences are exposed as insights on the 3GPP compliant interface. We are also looking to leverage some reusable services contributed by Machine Learning Execution Environment (MXE) from Ericsson’s Global Artificial Intelligence Accelerator (GAIA), as explained further in the last section of the blog.
Aligned to ensure model/use case portability across products
In our solution the two deployment options – the co-located and standalone NWDAF – can be viewed as two instances of the inference pipeline within a unified machine learning operations (MLOps) pipeline, with models trained in a common or identical training pipeline. Personally, I believe this scenario is the simplest to manage when it comes to model sharing, as Ericsson takes responsibility for both the training and the inference pipeline.
The model serving infrastructure in the inference pipeline is the key element for determining how to effectively share a model and, therefore, must be chosen carefully to balance interoperability/innovation and cost. Different artifacts can be involved in the model sharing depending on the capabilities of the target inference pipeline, as described in the previous section. Considering a minimalist approach – covering only the essential artifacts – the model will be packaged as an Open Container Initiative (OCI) compliant Docker image and run as a container in a microservices architecture. The major benefit is that models can be handled exactly like any other software artifact and managed by our orchestrator of choice, Kubernetes.
We aspire to have the model serving platform in different products – at least in 5G Core and EEA, which constitute our strategic NWDAF offering. We are striving to build products that use the same set of reusable microservices to enable ‘off the shelf’ model sharing between them. Communication service providers (CSPs) are able to leverage the flexibility of our distributed offering to build tailor-made solutions for their needs.
Extensible and future proof
The architecture is agnostic to use cases and models deployed, opening the opportunity to leverage it for any future ones. Of course, different use cases may require other characteristics based on the data storage and compute resources needed. Still, the beauty of cloud native is that we can scale dynamically based on the individual needs of the customer.
In addition, we can explore whether the ML platform needs to be extended to bring additional steps of the model lifecycle into the product. For example, model retraining, where the model is trained with target environment data as part of the product deployment in the customer network, could be realized by including additional services. Retraining is usually triggered when drift is detected in the CSP's environment, that is, when prediction accuracy falls below what is expected of the product. These services could be a subset of those used in the model development phase above.
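As a rough illustration of how drift could trigger retraining, the sketch below tracks prediction accuracy over a sliding window and flags the model once accuracy drops below a baseline by more than a tolerance. The `DriftMonitor` class, its thresholds and the window size are assumptions made for this example, not a description of the product.

```python
from collections import deque
from typing import Optional


class DriftMonitor:
    """Flags drift when rolling accuracy falls below a baseline by a margin."""

    def __init__(self, baseline_accuracy: float, tolerance: float, window: int) -> None:
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # Only the most recent `window` outcomes are kept.
        self.outcomes: deque = deque(maxlen=window)

    def record(self, predicted, observed) -> None:
        # Store True when the prediction matched what was later observed.
        self.outcomes.append(predicted == observed)

    def rolling_accuracy(self) -> Optional[float]:
        # Wait until the window is full before reporting an accuracy.
        if len(self.outcomes) < self.outcomes.maxlen:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self) -> bool:
        accuracy = self.rolling_accuracy()
        if accuracy is None:
            return False
        return accuracy < self.baseline_accuracy - self.tolerance
```

In a deployment, the observed value would come from target environment data, and a positive `needs_retraining()` result would hand over to whichever retraining services are included.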
Able to extend the data-driven approach building on software (SW) probe
At the heart of what we are trying to do is a data-driven architecture in our products. Our software (SW) probe enables intelligent data acquisition, which we intend to leverage to optimize the data the ML/AI framework handles. The use cases will identify the deployed models, which in turn will define the data acquired using SW probe event reporting. We can achieve this by tightly integrating built-in NWDAF with SW probe event reporting.
5G Core products are not new to intelligent and correlated information extracted from the 3GPP network function context on the northbound interface (NBI). We have had such events available since 2008 in Packet Core products, with results highlighting stability, low footprint and an excellent track record for troubleshooting and data analytics. In addition, this data interface is highly efficient, as it offers light data records which can be used for most use cases – in contrast to the tapping of raw packets and external packet analysis, which is resource intensive. Events also include cause codes and sub cause codes: additional data generated by leveraging Ericsson’s knowledge of 3GPP procedures to provide more details about failure incident reports.
EEA has leveraged this northbound data interface (NBI) from our legacy products to help cloud native 5G Core deliver advanced analytics use cases. With the introduction of the SW probe event reporting feature, we are making data acquisition smarter by introducing advanced filtering and sampling. Cloud native EEA has a footprint-optimized offering built on these capabilities in SW probe event reporting.
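To give a feel for what filtering and sampling at the source can mean, the following sketch keeps only the event types a deployed model needs and then samples a fixed percentage of subscribers. Hashing the subscriber identifier makes the decision deterministic, so every event of a sampled subscriber is kept. The field names `event_type` and `subscriber_id` and the function `keep_event` are hypothetical, and the actual SW probe feature may work quite differently.

```python
from __future__ import annotations

import hashlib


def keep_event(event: dict, wanted_types: set[str], sample_percent: int) -> bool:
    """Return True if the event passes the type filter and the subscriber sample."""
    # Filtering: drop event types that no deployed model consumes.
    if event["event_type"] not in wanted_types:
        return False
    # Sampling: map the subscriber to a stable bucket in the range 0..99.
    digest = hashlib.sha256(event["subscriber_id"].encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < sample_percent
```

Sampling by subscriber rather than by individual event is a design choice in this sketch: it preserves complete per-subscriber sequences, which matters when events are later correlated.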
Figure 2 shows what any such event record will typically contain for 3GPP signaling on the control plane. This event record is comprehensive in terms of information and includes common information – such as subscriber info between events from different 3GPP NFs – which ensures that the events can be correlated for an end-to-end (E2E) view of the network.
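The correlation described above can be sketched as grouping events from different NFs by a shared subscriber key and ordering them in time. The record fields used here (`subscriber_id`, `timestamp`, `nf`) are assumptions for illustration and are not taken from the actual event record layout in Figure 2.

```python
from collections import defaultdict


def correlate_by_subscriber(events):
    """Group events by subscriber and order each group by timestamp."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["subscriber_id"]].append(event)
    for timeline in timelines.values():
        # Sorting gives a per-subscriber, end-to-end sequence across NFs.
        timeline.sort(key=lambda e: e["timestamp"])
    return dict(timelines)


events = [
    {"nf": "SMF", "subscriber_id": "sub-1", "timestamp": 20},
    {"nf": "AMF", "subscriber_id": "sub-1", "timestamp": 10},
    {"nf": "AMF", "subscriber_id": "sub-2", "timestamp": 15},
]
timelines = correlate_by_subscriber(events)
```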
In our technology leadership initiatives, we have been working closely with tier one CSPs. We have identified that the use cases typically requested by CSPs can already be implemented using the existing event data we expose through SW probe event reporting. This gives us a unique advantage to listen to our customers and build our NWDAF focusing on use cases / insight exposure, instead of 3GPP compliance to event exposure. That being said, we still have the ambition to adhere to the standards needed to support third-party vendor integration, and this is reflected in our roadmap.
If we then evaluate the event exposure interface defined in 3GPP, we see some overlap with the content of the existing events exposed through SW probe event reporting. There are also event types that will require aggregation over a period, meaning we may have to retain the event data to some extent. We see coherence between the role SW probe takes in smarter data acquisition and the publish/subscribe interface described for event exposure. We are evaluating our options on how event exposure can be built to be consistent with the data architecture we’ve put in place.
What is the value we are looking to deliver?
The use case centric approach is integral to our NWDAF strategy. It will be interesting to qualify the value we are delivering by looking closely at one of the use cases where ML is applied already in our existing mobility management entity (SGSN-MME) product: ML Assisted Paging.
In wireless communication, the network usually needs to bring the user equipment (UE) back to connected mode through the paging procedure. Since the UE may change location while in idle mode, probabilistic eNodeB list paging, powered by machine learning, was introduced to locate the UE quickly with less paging signaling. It performs statistical analysis on historical data to tell the MME the current probable location of the UE. By using machine learning, paging becomes topology-aware and more efficient, and the reduced paging signaling frees up capacity and lowers capital expenditures (CAPEX) for carriers. This has resulted in up to 80 percent less paging signaling. We are also building the same use case for AMF paging optimization.
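The idea of paging a short, probable list of eNodeBs first can be illustrated with a simple frequency count. This is only a stand-in for the statistical analysis mentioned above, not the algorithm used in the product: it ranks the eNodeBs where a UE was seen historically and returns the shortest list covering a target share of past observations. The function name and the `coverage` parameter are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def probable_enodeb_list(history: list[str], coverage: float) -> list[str]:
    """Return the most frequently visited eNodeBs covering `coverage` of the history."""
    counts = Counter(history)
    total = sum(counts.values())
    selected: list[str] = []
    covered = 0
    # most_common() yields eNodeBs from most to least frequently observed.
    for enodeb, count in counts.most_common():
        selected.append(enodeb)
        covered += count
        # Compare counts rather than fractions to avoid rounding surprises.
        if covered >= coverage * total:
            break
    return selected


history = ["enb-a"] * 6 + ["enb-b"] * 3 + ["enb-c"]
first_attempt = probable_enodeb_list(history, coverage=0.8)
```

With this history, a first paging attempt would target two eNodeBs instead of three; a real network would fall back to a wider area if the UE does not respond.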
Something worth highlighting is the supervised closed loop we have implemented: using our standard configuration management interface, the operator can define a set of rules that control if and how the ML inferences are applied to the paging procedures. In this case, it is acceptable to apply this configuration ahead of the actual ML inference generation. If the use case demands and allows it, the same control can be given to the operator between inference generation and the resulting action, so that the operator must acknowledge or even validate the action before it takes effect. Of course, this may not be possible for all use cases, especially if low latency is required to close the loop – in that scenario, the operator could still govern it by defining a policy, as described in the use case above.
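A supervised closed loop of this kind can be pictured as a small policy gate between the inference and the action. In the sketch below, the `PagingPolicy` fields, the confidence threshold and the three outcome strings are invented for illustration; the real rules are defined through the configuration management interface and are not shown in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagingPolicy:
    """Operator-defined rules, configured ahead of inference generation."""
    ml_enabled: bool
    min_confidence: float
    require_operator_ack: bool


def decide_action(policy: PagingPolicy, confidence: float,
                  operator_acknowledged: bool = False) -> str:
    """Decide whether an ML inference may drive the paging procedure."""
    # Rule 1: the operator can switch ML off or demand a minimum confidence.
    if not policy.ml_enabled or confidence < policy.min_confidence:
        return "fallback"
    # Rule 2: optionally hold the action until the operator acknowledges it.
    if policy.require_operator_ack and not operator_acknowledged:
        return "pending_ack"
    return "apply"
```

For low-latency use cases, `require_operator_ack` would be left off so that the pre-defined policy alone governs the loop.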
How are we working with ML/AI towards building NWDAF?
As 5G gains traction, it will bring new complexities to network operations, and AI will be used to address network performance and automation. This forms the context of Ericsson's concentrated effort in AI, with a vibrant community across the company being extremely active in this technology space. It includes Ericsson Research and also brings technology leadership teams closer to the products. There are already solutions in action in customer networks.
When we started with our NWDAF implementation, we drew on the knowledge of our experienced pool of data scientists who had successfully applied ML/AI in customer networks. We are leveraging this to get a head start in delivering the NWDAF solution. Global Artificial Intelligence Accelerator (GAIA) is key in accelerating the execution of Ericsson's focused strategy by utilizing innovative AI and automation technologies to create data-driven, intelligent and robust systems for automation, evolution and growth. They are contributing to the modular Machine Learning Execution Environment (MXE), an internal cloud native framework for ML/AI. MXE in turn contributes to the ADP ecosystem with reusable common services which can be leveraged in Ericsson products, as shown in Figure 1.
The Packet Core technology leadership team has worked closely with CSPs on NWDAF use cases as part of a proof of concept leveraging the existing event reporting data. We were able to deploy and run ML models and better understand the value of core network data. This is also helping us better define use cases with clear business value. Additionally, we have experience using ML/AI in existing products in the Packet Core portfolio to improve the existing patented MME features of adaptive paging, as described earlier.
To feasibly support CSPs we have taken a use case-based approach. This means we’ve carefully selected use cases to develop, and haven’t only included those ‘frozen’ in 3GPP Release 16 and 17. We are also working with customers to identify newer use cases, as well as prioritizing the right ones. The following list is indicative of the use cases being considered, which can be divided into three main categories:
- Mobility tracking/prediction
  - ML Assisted paging
  - Mobility predictions
- Behavior tracking/prediction
  - UE Behavior handling
  - Predictive congestion management
- Quality of service (QoS) tracking/prediction
  - Enterprise service level agreement (SLA) assurance
  - User Plane Function (UPF) selection and reselection
  - Adaptive policies
  - Optimized best data plan
  - Slice selection based on load
This ML/AI ecosystem and our proven experience working with core network event data give us a unique advantage in bringing a flexible and mature NWDAF solution to the market.
Want to know more?
Read the blog post: How to overcome the challenge with probing in cloud native 5G Core
Read more about Ericsson’s NWDAF solution
Evolve your core network for 5G Core
Discover how cloud native is transforming the telecom industry
Explore how we are securing 5G experience with software probes