Trustworthy AI: explainability, safety and verifiability

As AI becomes more prominent across the globe, building trustworthy AI systems is paramount. Here, we discuss the foundations in the design of every AI system: explainability, safety and verifiability.

Senior Researcher Artificial Intelligence

Experienced Researcher, AI

Automation and AI Development Lead at Business Area Managed Services

Sector Manager, Machine Reasoning and Hybrid AI

Trustworthy AI is crucial to the widespread adoption of AI. We believe AI should be designed with a clear focus on a human-in-the-loop. We maintain that trustworthy AI should be built into the system by design (not as an afterthought).

There are multiple ingredients in trustworthy AI. In this post, we’ll show you how we proactively consider explainability, safety and verifiability as we set out to design AI systems. We’ll also give you a peek into how we use automated reasoning-based and symbolic AI-based approaches to build explainability and safety into our AI solutions.

How AI is perceived

Whether it’s sci-fi movies or everyday news, artificial intelligence seems to be everywhere. But perhaps unsurprisingly, there is little agreement about the details surrounding AI: what kinds of threats and benefits AI poses to society, for example, and whether humans can trust it. Stanford University’s One Hundred Year Study on AI offers some clues [page 6, Overview]:

In reality, AI is already changing our daily lives, almost entirely in ways that improve human health, safety, and productivity. Unlike in the movies, there is no race of superhuman robots on the horizon or probably even possible. And while the potential to abuse AI technologies must be acknowledged and addressed, their greater potential is, among other things, to make driving safer, help children learn, and extend and enhance people’s lives. In fact, beneficial AI applications in schools, homes, and hospitals are already growing at an accelerated pace.

Nonetheless, prominent figures, including AI research leaders, have voiced concerns about the rapid, and potentially irresponsible, development and deployment of AI-equipped tools. We acknowledge the potential risks and aim to address them by building trustworthy AI that is beneficial to society.

AI trustworthiness

How do we build AI systems that humans can trust? Isn’t trust an abstract quality? Let’s dig a little deeper. The trust equation, proposed by Charles Green and his collaborators, is a fundamental principle for how humans perceive trust in each other. For example, do we have confidence in what a colleague projects, can we depend on them, do we feel safe sharing with them, and do we believe their focus is aligned with our best interests? Now replace the phrase “a colleague” with “an AI agent” – how do these answers change?
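Rendered as a formula, Green’s trust equation divides the trust-building qualities by self-orientation:

```latex
\text{Trustworthiness} = \frac{\text{Credibility} + \text{Reliability} + \text{Intimacy}}{\text{Self-orientation}}
```

A colleague (or an AI agent) who is credible, dependable and safe to confide in, but acts mostly in their own interest, still scores low on trust.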

Figure 1: Charles Green’s trust equation

We believe the trust equation remains essentially the same when it comes to human-AI collaboration. For example, meeting human intent is central to building zero-touch systems, and this aligns well with prioritizing the highest-level business intent rather than self-orientation. If we marry the EU guidelines for trustworthy AI with these principles of trust, we can start addressing the trustworthiness of AI by focusing on aspects such as safety, transparency and explainability, and traceability and accountability, each of which puts the human at the center.

Ericsson has adopted these European Commission guidelines (read the blog post Ethics and AI: 8 steps to build trust in intelligent technology) and Ericsson Research is rising up to the challenge of building the technology pieces that will ensure our AI solutions abide by the principles of trustworthiness.

One of the greater challenges concerns the tension between rapid adoption and the trustworthy deployment of AI. When we reach out to our customers, a clear message comes back: they all want to adopt AI, and at a faster rate than ever before. A survey conducted by Ericsson in late 2018 showed that this sentiment is shared by 77 percent of service providers globally and is particularly strong in South East Asia, Oceania and India (91 percent) as well as North America (83 percent).

AI will undoubtedly play a crucial role as network complexity increases and service providers face the demands of handling multiple technologies such as 4G, 5G and IoT, as well as growth in the number of connected devices. From a consumer’s point of view, this means that more and more services, such as video streaming using over-the-top (OTT) services, will be managed by autonomous AI systems.

Research questions that emerge from these factors include: Why, under certain special circumstances, might a network service fail to meet its service level agreement (SLA)? Why is the coverage from a 4G radio antenna not sufficient to meet a particular service level agreement in a specific area? How can we ensure that energy efficiency actions can be implemented while maintaining network-level key performance indicators (KPIs)? How can we ensure robustness against uncertainty and adversarial attacks? These translate to ensuring that the AI systems managing tomorrow’s networks will be robust, safe, verifiable and explainable. In the end, we believe that the more trustworthy AI systems become, the quicker their adoption will be.

Explainable AI

AI explainability is about explaining to users how an AI system makes decisions. How many times have you wondered why a movie was automatically recommended to you on a video streaming service? Remember being astounded, or highly disappointed, by the recommendation after watching the movie? Wouldn’t it be great to get some extra information on why the AI behind the scenes recommends a particular movie? Providing explanations for the recommendations or decisions of AI systems, to make them more trustworthy, is the goal of explainable AI (XAI).

Figure 2. The need for explainable AI.

We recognize that explainability is a crucial component of trustworthy AI. In short, we stipulate that an explainable AI system has to produce the details and underlying reasons for its functions, processes and output. Of course, in telecommunication network operations, we’re not talking about explaining things such as movie recommendations. Instead, we consider the explainability of complex AI systems that usually consist of multiple AI components – sometimes called AI agents. Think of video streaming service quality assurance. It entails a multitude of complex tasks where AI has become necessary:

  • We need to constantly monitor and predict future traffic in the network to be ready for increased user demand – imagine a mass of people tuning in to watch a new episode of a show at the same time.
  • When traffic spikes, and therefore network congestion, are predicted, we need to be prepared to reconfigure network routers and relocate network resources so that the video stream continues to be served without delays.

And of course, we have to relocate resources without negatively affecting other services running on the network, such as automatic vehicle traffic signalling.

Figure 3. An example of the multitude of tasks tackled by AI in video streaming slice assurance scenario.

Different types of AI agents help in all these tasks:

  • Machine learning (ML) models may be predicting the network’s future traffic.
  • Rule-based systems may determine the routers most likely to be congested.
  • Constraint solvers may yield network reconfigurations that divert traffic from congested routers.
  • Autonomous planners may find how to optimally execute the reconfigurations.

In such complex settings, explanations at different levels, targeting different audiences – including end-users, service providers and network operators – become crucial. Explainability is first and foremost important to network operators, who need to be able to inspect the details and reasons behind the AI system’s decisions. For instance, suppose a machine learning (ML)-based agent predicted that latency on a video streaming service would increase beyond the acceptable 25 milliseconds.

  • To explain what such a prediction was based on, we can invoke ML model explainers, which identify the features (pertaining to network performance counters) that were important for the prediction. Even though such explanations do not guarantee that the attributed features are all and only those that matter, they nevertheless allow the network operator to inspect whether the ML-based agents behave reasonably, and therefore support trustworthiness.

  • Further, the attributed features may enable root cause analysis of the predicted latency increase. To that end, suppose a rule-based AI agent uses the indicated network performance counters to determine the most likely root cause(s) of the latency violation, for instance a congested router port. The human operator may want an explanation for this too, in which case we could invoke rule-tracing explanations to show which facts and what domain knowledge were used to infer the root cause.

  • Going further, given a root cause, the goal is to reconfigure the video streaming slice to avoid latency violation. To this end, an AI agent using constraint solving techniques aims to find a network reconfiguration, say a path through a different data center, that satisfies the slice requirements, including latency.
  • Finally, an AI planning agent provides a procedural knowledge-based plan for execution of the reconfiguration (i.e. how to optimally relocate network resources). To explain the decisions of these agents, we may extract contrastive explanations that detail alternative actions, their consequences and differences that indicate why the proposed solution was chosen.
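As a concrete sketch of the feature-attribution step above, an explainer can be as simple as permutation importance: shuffle one counter at a time and measure how much the prediction moves. The counter names and the stand-in model below are invented for illustration, not actual network features:

```python
import random

FEATURES = ["port_utilization", "packet_loss", "cpu_temperature"]

# Stand-in for a trained latency predictor: depends strongly on
# port_utilization, weakly on packet_loss, not at all on cpu_temperature.
def predict_latency(sample):
    return 10 + 40 * sample["port_utilization"] + 5 * sample["packet_loss"]

def permutation_importance(model, data, n_repeats=10, seed=0):
    """Score each feature by how much shuffling it perturbs predictions."""
    rng = random.Random(seed)
    baseline = [model(s) for s in data]
    scores = {}
    for feat in FEATURES:
        total = 0.0
        for _ in range(n_repeats):
            shuffled = [s[feat] for s in data]
            rng.shuffle(shuffled)
            for s, v, b in zip(data, shuffled, baseline):
                total += abs(model({**s, feat: v}) - b)
        scores[feat] = total / (n_repeats * len(data))
    return scores

rng = random.Random(1)
data = [{f: rng.random() for f in FEATURES} for _ in range(50)]
scores = permutation_importance(predict_latency, data)
print(max(scores, key=scores.get))  # port_utilization attributed most
```

A production explainer such as SHAP or LIME is more sophisticated, but the principle is the same: attribute the prediction back to the input counters the operator can inspect.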
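The rule-tracing step can likewise be sketched as forward chaining that records every rule it fires, so the chain from observed counters to the inferred root cause can be replayed for the operator. The rules and facts are invented for illustration:

```python
# Invented domain rules: (premises, conclusion) pairs.
RULES = [
    ({"high_port_utilization", "rising_queue_depth"}, "congested_router_port"),
    ({"congested_router_port"}, "latency_violation_risk"),
]

def infer_with_trace(observed):
    """Forward-chain over RULES, recording every rule that fires."""
    facts = set(observed)
    trace = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append((sorted(premises), conclusion))
                changed = True
    return facts, trace

facts, trace = infer_with_trace({"high_port_utilization", "rising_queue_depth"})
for premises, conclusion in trace:  # the explanation shown to the operator
    print(" & ".join(premises), "=>", conclusion)
```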
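Finally, a contrastive explanation for the chosen reconfiguration can be sketched as constraint checking over candidate configurations, recording why each rejected alternative fails. The candidate paths and requirements here are invented:

```python
# Invented candidate slice configurations: (name, latency_ms, bandwidth_gbps).
CANDIDATES = [
    ("keep-current-path", 32, 12),
    ("reroute-via-dc-2", 18, 10),
    ("reroute-via-dc-3", 22, 4),
]
MAX_LATENCY_MS = 25
MIN_BANDWIDTH_GBPS = 8

def choose_and_explain(candidates):
    """Pick the first feasible candidate; record why each alternative fails."""
    chosen, reasons = None, []
    for name, latency, bandwidth in candidates:
        if latency > MAX_LATENCY_MS:
            reasons.append(f"{name}: rejected, latency {latency} ms > {MAX_LATENCY_MS} ms")
        elif bandwidth < MIN_BANDWIDTH_GBPS:
            reasons.append(f"{name}: rejected, bandwidth {bandwidth} Gbps < {MIN_BANDWIDTH_GBPS} Gbps")
        elif chosen is None:
            chosen = name
            reasons.append(f"{name}: chosen, meets both requirements")
    return chosen, reasons

chosen, reasons = choose_and_explain(CANDIDATES)
print(chosen)  # reroute-via-dc-2
```

A real solver explores a far larger configuration space, but the rejected-alternative records are exactly the raw material for “why this and not that” explanations.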

Figure 4. Different explainers will explain various AI agents, and their explanations will be combined into explanations for the human operator.

In the end, suppose the video streaming service was reconfigured to use a different edge router and data center. The human operator may want to know why, at a high-level. The AI system could explain that in order to meet the latency objective, several predicted latency violations had to be resolved. To that end, alternative root causes of violations have been found and evaluated, from which various solutions were proposed, and the best solution – involving slice relocation to a different data center – was implemented.

Such explanations could be conversational, realized as dialogues between the human and the AI system. These dialogues would exhibit the key considerations and weigh the arguments for and against the best outcomes in every phase of the AI system’s workings, with the possibility for the human operator to probe deeper into the details of each AI agent involved.

Figure 5. Dialogical explanations (left) for high-level decision making could be obtained from argument graphs (right).

Explainability is also important for debugging and improving AI-equipped systems. For example, the AI system may consistently decide to compromise the cost-efficiency of the video streaming service in order to meet the latency requirements. Upon inspection, however, a human operator may recognize a better trade-off: once in a while, the resource-efficiency of some other service can be compromised instead, while still meeting all the service level agreements (SLAs). We aspire to enable such human intervention in our autonomous AI systems, as well as the consequent learning by AI from human common sense and values, to increase and sustain trust in our AI systems.

We believe that this description of explainability, in all its variants considered above, will support the trustworthiness of AI systems.

Safe AI

Concerns that AI agents could (inadvertently or maliciously) take actions that are unsafe to humans have led to significant research in the area of safety in AI. We detail here our approach to safety in one class of AI algorithms – reinforcement learning. Consider a trip to our local amusement park. We’ve visited before, so we have some favorite rides and games that we enjoy. Now we could go two ways: (1) stick to the tried-and-tested rides we like, or (2) explore something new that could be even better, at the risk of wasting time and money on something we may not enjoy. Of course, we could choose a healthy balance where we exploit what we already know, which leads us to desirable situations, but also explore so that we can benefit from as yet undiscovered choices.

This represents the classic trade-off between exploration and exploitation seen in reinforcement learning (RL)-based agents, which perceive the environment and decide what action to take to optimize for a reward (collected from the environment). In general, greedy agents, or those that only exploit known choices, perform poorly, especially when applied to complex problems.

Exploring the space of all possible states and actions is often seen as an effective way to train a reinforcement learning agent to capture a policy that is as close to optimal as possible. However, unchecked exploration has dangers, especially when RL is applied in critical systems such as those in telecom. For example, unchecked exploration can lead to the system visiting a dangerous state. In our amusement park example, this could be a ride that puts us at risk of falling off. Therefore, we would like to encourage exploration while keeping our agent within safety boundaries.

This is the problem of safe reinforcement learning, one that we encountered when designing a reinforcement learning-based solution for computing optimal antenna tilts to optimize network KPIs. We saw that RL agents trained using model-free algorithms such as deep Q-learning learned to arrive at optimal policies. However, since agents only optimize for the reward, they learned to “hack” it, i.e. they progressively optimized for one or more easy-to-achieve elements of the reward function. In some situations this would lead to other elements of the reward function taking on undesirable values. Excessive over-engineering of the reward function is clearly not the solution, since we would like the RL solution to be fairly general, for example capturing optimal policies across communications service providers (CSPs).

What we seek is a means for the agent to freely explore the state-action space of the environment, within boundaries given by safety specifications defined by a human. These safety specifications, or boundaries, can vary across users. For example, CSP A may want to ensure that the received signal strength stays within a defined threshold at all times, while CSP B may want high-level indicators, such as throughput, to be greater than 100 Mbps, while allowing them to dip below that temporarily (say, for 5 time units) as long as the coverage and capacity do not decay in that window.
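For instance, these two requirements might be sketched in LTL as follows, where $\mathbf{G}$ means “always”, $\mathbf{F}_{\le 5}$ means “within 5 time units”, and $\mathbf{G}_{\le 5}$ means “throughout the next 5 time units”; the atomic propositions $\mathit{rsrp}$, $\mathit{tput}$ and $\mathit{cov\_ok}$ are illustrative names, not actual counters:

```latex
\varphi_A \;=\; \mathbf{G}\,(\mathit{rsrp} \ge \theta)
\qquad
\varphi_B \;=\; \mathbf{G}\,\bigl(\mathit{tput} < 100 \;\rightarrow\; \mathbf{F}_{\le 5}\,(\mathit{tput} \ge 100) \,\wedge\, \mathbf{G}_{\le 5}\,\mathit{cov\_ok}\bigr)
```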

Such specifications represent how real users would like to interact with intelligent systems. Today, they are defined within service level agreements and imposed by a combination of human reasoning and specialized network tools. However, applying these specifications in RL agents is not straightforward. As it turns out, such rich specifications, which encode notions of time, can be expressed using an interesting class of logic known as linear temporal logic (LTL).

Specifications like these have traditionally been used in the formal verification of software and hardware systems such as chips, aircraft, and so on. Formal verification is a set of techniques for checking that a system – represented by its state-action transitions – complies with specified properties. It holds promise for identifying safe or unsafe state traces with respect to a safety condition.

What makes it attractive is the formal guarantee on whether a property is satisfied by a system or not. In fact, when a property is not satisfied, it provides counterexamples – state traces that explicitly violate the property. This is exciting because a safety shield can consume such traces and block actions that are expected to take the system into those regions of the state space.

However, there’s a challenge: model-free RL agents cannot be explicitly represented by their state transitions, and so cannot be directly verified. Back to our antenna-tilt problem: if we can represent the state-action transitions formally, using structures such as the Markov Decision Process (MDP), and define specifications based on the states present in this MDP, we can utilize tools such as Storm and PRISM that enable model checking of probabilistic systems.

For building models such as MDPs, quality operator data is invaluable. Together with a large number of simulator runs, we build an “abstract” MDP, i.e. one that captures only the safety-relevant state information. Such MDPs can even be generated on demand from historical data in response to a specification. This MDP, together with the safety specification, allows the development of a safety shield that blocks undesirable actions.
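A minimal sketch of how such a shield might sit between the agent and the environment, assuming the abstract MDP has already been analyzed offline so that unsafe (state, action) pairs are known. The state and action names here are invented for illustration:

```python
# (state, action) pairs flagged by offline model checking of the abstract
# MDP, e.g. extracted from Storm/PRISM counterexample traces. Invented here.
UNSAFE = {
    ("high_load", "tilt_down_2deg"),
    ("high_load", "tilt_down_4deg"),
}

def shielded_step(state, proposed_action, fallback_action="no_op"):
    """Block actions the verifier flagged as unsafe; substitute a fallback."""
    if (state, proposed_action) in UNSAFE:
        return fallback_action, True   # True: the proposal was blocked
    return proposed_action, False

print(shielded_step("high_load", "tilt_down_2deg"))  # ('no_op', True)
print(shielded_step("low_load", "tilt_down_2deg"))   # ('tilt_down_2deg', False)
```

The RL agent still explores freely; only the small set of verified-unsafe actions is intercepted before reaching the live network.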

Figure 6: A safety shield to block unsafe actions towards the environment, that may result from free exploration of state-action spaces by RL agents.

Figure 7: Markov Decision Processes to model state action transitions observed from the experience replay of a model-free reinforcement learning agent.

Figure 8: Formal verification of probabilistic models is done by performing a cross-product between a state transition model and a specification automaton, and consequently computing all possible paths through the system that violate the specification.

Figure 8 is adapted from the paper Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning.

We see that using logical constraints to capture human-like reasoning allows RL agents to explore freely within these specified bounds. Indeed, since the specification can be rich, such an approach allows for domain-guided reinforcement learning (beyond just safety). For example, a specification such as “ensure that the capacity KPIs only improve during the course of a solution path” would be very hard to include using classical reinforcement learning techniques – how would we model a reward function that includes constraints like this? This is an example where reasoning is invaluable for developing trustworthy AI.

Verification of AI modules – some context

Formal verification is a set of powerful mathematical techniques that guarantee the correctness of a model – in our case, ensuring that certain properties are met (formal proofs). If a property is not met, these techniques may provide a counterexample, i.e. one or more traces of the system that do not satisfy the property being studied. Formal verification can return two types of answers: “yes/no” answers, when verifying properties that are either satisfied or not but cannot be measured, and numerical answers, when the verification returns a number that might represent, for example, the maximum accumulated energy for reaching a given goal expressed as a reachability property.

Indeed, formal verification is valuable for safe AI (by ensuring safety properties are not violated in any operational mode of the system), as well as for explainable AI (by providing verification proofs against properties of interest, as explanations). We thus see that formal verification can provide a rigorous toolkit for ensuring trustworthy AI. We also envisage the use of formal verification in assisting the audit of AI-based systems.

For example, consider a user of an AI model playing a game where the aim is to collect rewards in a grid-world environment, as in Figure 9. The user may want to know if there is any chance a given cell will be visited, and indeed to have a guarantee that cell X will never be visited. Such problems are often approached using formal verification.
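For a deterministic policy, such a guarantee can be checked exhaustively. The following sketch, with an invented grid, policy and forbidden cell, enumerates every cell the policy can ever reach from the start and confirms that cell X is never among them – the same style of state-space exploration a model checker performs:

```python
from collections import deque

GRID_W, GRID_H = 4, 4
START = (0, 0)
FORBIDDEN = (2, 2)  # "cell X" that must never be visited

def policy_successors(cell):
    """Moves this (invented) policy allows: right along the top row,
    then down the last column toward the reward."""
    x, y = cell
    if x < GRID_W - 1 and y == 0:
        return [(x + 1, y)]
    if y < GRID_H - 1:
        return [(x, y + 1)]
    return []  # terminal cell

def reachable_from(start):
    """Exhaustively enumerate every cell the policy can ever visit."""
    seen, frontier = {start}, deque([start])
    while frontier:
        for nxt in policy_successors(frontier.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

visited = reachable_from(START)
print(FORBIDDEN not in visited)  # True: the guarantee holds for this policy
```

For probabilistic or learning agents the transition system is larger and stochastic, which is exactly where tools such as Storm and PRISM take over from hand-rolled search.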

Figure 9: An AI model playing in a grid world to collect rewards.


Figure 10: Illustrating how formal verification examines all possible state transition paths of a system, and checks whether given logical properties are satisfied.

Although applying such techniques to AI models that evolve over time is not straightforward, some early research results exist, such as the paper Toolkit for the Formal Design and Analysis of Artificial Intelligence-Based Systems. Indeed, the work discussed in the safe reinforcement learning section above used formal verification techniques to arrive at counterexamples in order to identify safe and unsafe states. However, formal verification can be applied more widely. One example would be to verify, and later explain, why and when a system consumes resources, by looking at a trace deduced from formal verification.

AI’s potential and the road ahead

We are just getting started with AI explainability, safety and verification – some of the main components of trustworthy AI. The path ahead is exciting, and there is a lot we can do – both within research and product development and deployment. Here at Ericsson Research we are working on developing causal models of our systems, which can greatly improve explainability, traceability, performance and reliability of AI solutions that use them.

We also see great potential in hybrid approaches that reconcile symbolic AI with statistical learning to improve the degree of trust in AI solutions. Such approaches would allow intents to be directly considered in a system, and intents allow us to connect back to the human. Revisiting the trust equation at the beginning of this post, we see that credibility, reliability, intimacy and user-orientation can be gradually enhanced when these are transformed into technical intents and fed to the system. Of course, applying these intents to an AI/ML model is not always straightforward, and this constitutes a key area we’d like to continue researching.

There are also many other classes of problems that we are just getting started on – for example, detecting when an AI system consistently gives outputs that are biased towards groups of people, and mitigation strategies for those. Traceability and assigning accountability for a given outcome is also crucial. Explaining emergent behavior in composite AI systems is another exciting research problem we are working on. And believing that AI systems can be trusted more as a result of our work gets us going!

 The authors would like to thank Alexandros Nikou, Ezeddin Al Hakim, Swarup Kumar Mohalik, Sushanth David S, Alessandro Previti, and Marin Orlic for their contributions.

Learn more

Read our blog post on implementing AI: The unexpected destination of the AI journey.

Read all about Democratizing AI with automated machine learning technology.

Connect with the authors on LinkedIn:

Anusha Mujumdar

Kristijonas Čyras

Saurabh Singh 

Aneta Vulgarakis Feljan

The Ericsson Blog
