
AI product security: A primer for developers

AI security has evolved from solely using AI for cybersecurity to also addressing the security of AI systems themselves. The work of Ian Goodfellow and colleagues ([5], [10], and [14], to name a few) popularized adversarial attacks on machine learning, highlighting the need to secure AI systems. This article is addressed to developers, designers, managers, and researchers, emphasizing the importance of understanding threats to machine learning products and avoiding common pitfalls in AI product development, including AI products used in telecommunications.

Senior AI Technical Product Manager, GFTL GAIA

Distinguished Engineer

The paper "Computer security checklist for non-security technology professionals" [1] presents a short checklist of activities in three groups:

  1. Perform risk analysis
  2. Conduct vulnerability assessments
  3. Establish education, procedures, and policies

Though published in 2006, these practices are still relevant today. However, AI security now goes beyond these three groups of activities. From the specific threats in machine learning (ML) to the role of generic software product security, this article takes you on a journey through the ever-evolving landscape of AI security.

AI-specific threats in ML

The taxonomy of attacks on an ML product can be divided across two attack surfaces, as shown below.

[Figure: AI security: taxonomy of attacks on an ML product, split across the ML attack surface and the traditional attack surface]

AI-specific attacks appear on the ML attack surface:

  • Poisoning: modifying a benign training dataset.
  • Evasion: supplying a malicious input to elicit an unexpected output.
  • Oracle: stealing information by probing the model.
  • Adversarial reprogramming: using the model for a task it is not intended to do.

The traditional attack surface covers generic vulnerabilities of a software product:

  • Unauthorized access to data or the model can affect the confidentiality, integrity, and availability of the system. 
  • Malicious feedback can negatively influence a model’s development, which may limit its ability to perform as expected. 

Poisoning

Tampering with the training dataset by adding, deleting, changing, or rearranging its contents can lead to inaccurate learning and, ultimately, a model that generates incorrect inferences. Poisoning can also violate one or more of availability, integrity, and confidentiality/privacy [3]. Here are two examples:

  • Consider a network fault identification and classification model. An attacker can inject multiple copies of seemingly benign samples into the training dataset. If an excessive amount of the same class of data is added, it can compromise the integrity of the output by skewing convergence. If an abnormally high volume of data is injected beyond the expected load, the resulting computation can stall the pipeline and affect the system’s availability.
  • Consider a linear regression model that forecasts financial information. The attacker can infer the distribution of the financial inputs by manipulating the input data distribution and observing the output. This manipulation could compromise the confidentiality of the dataset, impacting the overall privacy of the data.

The risk of poisoning is greater in products where there is limited control over data collection, such as crowdsourcing or federated learning. The following are potential countermeasures in such scenarios:

  • Assess the reliability of participants. Ensure that only verified members can provide inputs to the dataset [4]. Verification can involve authorizing the members who contribute to the dataset and checking the authenticity of messages before they are used to train an ML model.
  • Evaluate the load before training and preprocessing to ensure the pipeline will not stall, and raise exceptions when the data exceeds the expected distribution. For example, depending on the specific use case, it may be preferable to train on a subset of the input data rather than the entire dataset. Assess the class-wise distribution to detect data drift: outliers, particularly if they consistently come from a small subset of participants, could indicate an attack. Depending on the use case, this method may not distinguish natural drift from a poisoning attack, but it can still alert the model operator to initiate an investigation.
  • When crowdsourcing, use randomly selected subsets from a large community for the training and testing datasets. This approach increases the amount of work the attacker must do to get enough poisoned data points included in the model. A minimal sketch of these last two countermeasures follows this list.
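Below is a minimal sketch of the last two countermeasures, assuming labels and crowdsourced samples arrive as plain Python lists; the function names and the tolerance value are illustrative choices rather than a prescribed defense.

```python
import random
from collections import Counter


def class_distribution(labels):
    """Return the relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def flag_class_drift(reference_labels, incoming_labels, tolerance=0.15):
    """Flag classes whose share in the incoming batch deviates from the
    reference distribution by more than `tolerance` (absolute difference).
    A flagged class is a trigger for investigation, not proof of poisoning."""
    ref = class_distribution(reference_labels)
    new = class_distribution(incoming_labels)
    return {
        cls: abs(new.get(cls, 0.0) - ref.get(cls, 0.0))
        for cls in set(ref) | set(new)
        if abs(new.get(cls, 0.0) - ref.get(cls, 0.0)) > tolerance
    }


def sample_training_subset(samples, fraction=0.7, seed=None):
    """Randomly subsample crowdsourced data so an attacker cannot be sure
    their poisoned points end up in the training set."""
    rng = random.Random(seed)
    return rng.sample(samples, int(len(samples) * fraction))
```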

Evasion

Modifying a sample before inference can cause the model to make inaccurate predictions; the modified sample is known as an adversarial example. Vulnerabilities to evasion attacks are common in image classification systems (usually neural networks). Goodfellow et al. [5] show how an image of a panda can be perturbed in such a way that GoogLeNet misidentifies it as a gibbon (a type of ape) with 99.3% confidence, even though the images before and after perturbation look identical to the human eye. Note that the perturbation is applied only to the sample provided to the model, without any modification to the model’s parameters or weights.
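As an illustration of how little is needed to craft such a perturbation, here is a minimal sketch of the fast gradient sign method described in [5], assuming a PyTorch classifier that maps a batched image tensor to logits; the epsilon value and the [0, 1] clamping range are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def fgsm_example(model, image, label, epsilon=0.007):
    """Craft an adversarial example with the fast gradient sign method:
    nudge each input pixel in the direction that increases the loss.
    Assumes `image` is a batched tensor scaled to [0, 1] and `label`
    holds the true class indices."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```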

An evasion attack is a direct violation of integrity. Depending on how a system is designed, this vulnerability can also violate confidentiality (if the model is used to authorize access) or availability (for example, a virus or worm that bypasses malware checks by warping its signature). A detailed survey of evasion attacks is given in [6].

Specific risk mitigation for evasion attacks depends on the application that uses the ML model, such as malware detection, phishing detection, Internet of Things (IoT), smart grids, etc. Nevertheless, studying the threat surface can help to identify possible treatments to reduce this risk. Start by answering the following questions:

  • Where in the pipeline can a possible perturbation occur?
  • What is the data source for the example used for inference and validation?
  • Has the pathway from the source to the inference engine been secured?

For instance, in an image classification scenario such as drone-based infrastructure monitoring (see our earlier blog post on drone monitoring), perturbations can occur in the image samples used for inference. In a system that applies face detection, the data source is the camera, so ensure that the data flow from camera capture to data storage and then to the inference feed is secure. In situations where you have minimal control over the data source, you may apply certain transformations to the data before inference [7].
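One example of such a transformation is to re-compress and coarsen the input before inference. The sketch below uses Pillow and is only one of the transformation-style mitigations surveyed in [7]; the function name and parameter values are illustrative, and the approach reduces rather than removes the risk.

```python
import io

from PIL import Image


def squeeze_input(image: Image.Image, jpeg_quality=75, bit_depth=5) -> Image.Image:
    """Apply simple input transformations before inference. They do not
    eliminate evasion attacks, but they can wash out small, carefully
    tuned perturbations."""
    # JPEG re-compression discards high-frequency detail an attacker relies on.
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=jpeg_quality)
    buffer.seek(0)
    image = Image.open(buffer).convert("RGB")
    # Reduce colour depth (feature squeezing) to coarsen the pixel values.
    levels = 2 ** bit_depth - 1
    return image.point(lambda v: int(round(v / 255 * levels)) * (255 // levels))
```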

Oracle

An oracle attack in ML, or model extraction/stealing attack [8], involves an attacker attempting to extract the internal parameters or architecture of a model by querying it to infer its decision boundaries. The goal is to recreate a copy of the model and potentially use it for malicious purposes, such as stealing sensitive information or intellectual property. This type of attack can happen when the attacker has access to the model’s predictions but not the training data or the model’s parameters. The attacker can construct a specific set of queries to the model to infer the underlying logic or data. The information gained from an oracle attack can also be used to facilitate other types of attacks. 

For example, an attacker could attempt to extract the internal parameters or architecture of a machine learning model used for customer churn prediction or propensity modeling by querying it with carefully crafted input data and observing the model's outputs. The attacker's goal could be to recreate the model and potentially use it to gain insights into the telecom operator's customer base, usage patterns, or sensitive business information, violating customer privacy and intellectual property rights.

Oracle attacks exploit the availability of the model. One simple defense is to limit access to the model, similar to defending against denial-of-service (DoS) attacks. The success of an oracle attack depends on the number of queries the attacker can submit, so throttling query access can slow or prevent an attack. Frequent model updates can also help by changing the model before an attacker can submit enough queries. It is important to note that when applying transfer learning to a publicly available pre-trained model (such as ResNet50 trained on ImageNet [9]) or to a model used by another entity, the vulnerability may persist despite restricting access to the model. In that case, the attacker can obtain the base pre-trained model, take as much time as needed to craft a set of curated adversarial examples offline, and then compare them against the target system, even with only a few opportunities to test instances and observe the output.
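A minimal sketch of query throttling follows, assuming requests can be attributed to a client identifier; the class name and the limits are illustrative, and production systems would typically enforce this at the API gateway rather than in application code.

```python
import time
from collections import defaultdict, deque


class QueryThrottle:
    """Per-client sliding-window rate limiter for a model-serving endpoint.
    Limiting query volume raises the cost of an oracle/model-extraction attack."""

    def __init__(self, max_queries: int, window_seconds: float):
        self.max_queries = max_queries
        self.window = window_seconds
        self.history = defaultdict(deque)  # client_id -> recent query timestamps

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        timestamps = self.history[client_id]
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_queries:
            return False
        timestamps.append(now)
        return True


# Example: allow at most 100 inference calls per client per minute.
throttle = QueryThrottle(max_queries=100, window_seconds=60.0)
if not throttle.allow("client-42"):
    raise RuntimeError("query budget exceeded; request rejected")
```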

Adversarial reprogramming

An attacker can repurpose a model for a task different from its intended one. The term was popularized by Goodfellow and colleagues [10], who repurposed ImageNet object classification models to perform a counting task and a different classification task. Just like evasion, this attack does not require any change to the weights of the model. Once enough knowledge is acquired from the inputs and corresponding outputs of the system, for instance through an oracle attack, an attacker can craft an adversarial program that turns inputs into adversarial examples. When such an example is passed through the model, the resulting mapping is the one defined by the attacker’s adversarial program.
While poisoning and evasion manipulate the model into making a mistake, adversarial reprogramming prompts the model to make a specific mistake chosen by the attacker for a different functional agenda. This agenda can range from the unauthorized use or theft of compute resources to covertly transmitting messages, effectively turning the system into a spy.

Protecting the model’s availability and parameters helps reduce exposure to adversarial reprogramming. Adversarial-training-style defenses such as defensive distillation do not eliminate the vulnerability to reprogramming (although they may increase the compute an attacker needs). Other ways to reduce the risk of such an attack are regularly updating the model so that it remains resilient against new types of attacks, and monitoring the model’s behavior in real-world scenarios to detect anything unusual, as sketched below.
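One way to monitor for unusual behavior is to compare the distribution of predictions in production against what was observed during validation. The sketch below is an illustrative, simplified monitor; the class name, the expected shares, and the tolerance are assumptions, not part of any cited defense.

```python
from collections import Counter


class PredictionMonitor:
    """Track the distribution of predicted classes in production and flag
    large shifts from what was seen during validation. A sustained shift can
    indicate misuse, including attempts at adversarial reprogramming."""

    def __init__(self, expected_shares: dict, tolerance: float = 0.2):
        self.expected_shares = expected_shares  # class -> expected share of predictions
        self.tolerance = tolerance
        self.observed = Counter()

    def record(self, predicted_class) -> None:
        self.observed[predicted_class] += 1

    def drifted_classes(self) -> dict:
        total = sum(self.observed.values()) or 1
        return {
            cls: self.observed.get(cls, 0) / total
            for cls, expected in self.expected_shares.items()
            if abs(self.observed.get(cls, 0) / total - expected) > self.tolerance
        }
```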

Traditional attacks

Data flows through multiple components in the pipeline, from ingestion to monitoring. An attacker can target this traffic to access confidential information or even mount a man-in-the-middle attack.
A model also passes through various states along the pipeline. Once trained, it absorbs the data in the form of weights; in production, it resides in model storage and is served during inference. The model can be attacked at any point in these state transitions.

Certain pipelines have feedback mechanisms to support the retraining of models. If feedback from many sources is considered, a few malicious feedback entries will not necessarily skew the training. However, if the feedback flow is compromised, an attacker is free to inject or malform any feedback.
Hardening the ML pipeline can defend against traditional attacks and possibly some AI-specific attacks. Possible security controls include (but are not limited to) the following.

  • Constraining listening services and bindings (see the sketch after this list)
  • Imposing strict access control mechanisms
  • Protecting traffic and data
  • Performing periodic vulnerability analysis
  • Following secure coding practices
  • Logging security events
  • Securing the runtime environment
A full treatment of software product security vulnerabilities would require a course of its own. However, the Open Worldwide Application Security Project (OWASP) keeps an up-to-date summary of the top 10 vulnerabilities that pertain to web applications [11]. Since more and more applications today are accessible through the web, the OWASP Top 10 is relevant for nearly every consumer product.

Security pitfalls in AI product development

At times, it is natural for developers to overlook the generic security aspects of their AI products and fall into some of the following pitfalls.

Lack of security expertise in your team

Security might not be the top priority for a data scientist, machine learning engineer, or AI product developer. Unless there is a dedicated security expert in your team, you are bound to overlook some security or privacy flaw that may creep into development. You cannot expect the team to undergo the rigorous security training that is expected of a security professional. Nevertheless, there should be a basic security training program in place for all members, and a security expert to address any specific concerns or queries.
For instance, a data scientist may not be aware of the current TLS standards to protect the data in transit from one component to another. 

Missing out on security requirements at the start of a project engagement

When security requirements come midway through the project, they can create complications in development and program management: Who can take ownership of the activity? Who has the expertise to drive these requirements? What impact will this have on the deadline? Even if you feel your project has little to do with security, at least have an early discussion so that all stakeholders are aware of the possible security concerns and unpleasant surprises are avoided.

Assuming security aspects will be taken care of by another team

Within a dedicated product development unit, the roles of each member are (mostly) well-defined. In this case, there would be a dedicated team to handle security-related issues. The program management expects a representative from this team to participate in product requirements discussions to close the security gap. However, when working with cross-functional teams, it is not always clear who will be responsible for specific functions. It is vital in such situations to set clear expectations on which team will handle the security function rather than assuming "it will be taken care of by another team".

Delaying security and privacy compliance activities toward the end

Envision this scenario: Your team has put in all their effort to polish a product based on its priority functional requirements. The product then goes through a security assessment. The report uncovers that some packages have critical vulnerabilities and that component integration has introduced traffic threats; it also finds that your models should be encrypted in storage and that some packages should not be used in production at all. In the worst case, this can lead to an overhaul of the entire product design. The solution is to shift security left: integrate security best practices into the product design at an early stage to avoid such issues.

Using features without evaluating their sensitivity

Let's say you demonstrate a successful proof of concept (PoC); everyone gets excited, and the sales team wants to show it to the customers as soon as possible to get feedback. In the assessment, you notice some sensitive fields you should not have used in your ML model, such as gender, region, and age. Once you remove these fields and retrain the model, you observe the model performance taking a hit – far below the promised value. The product ultimately fails to meet its requirements for industrialization. This issue can be averted if a privacy impact assessment is conducted for fields before considering them as features (or using them to derive features) for model fitting. 

This pitfall does not necessarily have to be ML-model related. In telecom, subscriber data is highly sensitive; examples include phone numbers, the mobile station international subscriber directory number (MSISDN), and the international mobile subscriber identity (IMSI). While these items may hold little significance for an ML model, it is important that they are masked, with appropriate access control, when an ML outcome is mapped to a subscriber. For instance, not all network operators need to access this data when using a dashboard for anomaly detection of cell traffic.
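As one illustration of such masking, subscriber identifiers can be replaced with keyed pseudonyms before they reach a dashboard. The sketch below uses an HMAC for this purpose; the field names, identifier, and key are dummy values, and real systems also need key management and rotation.

```python
import hashlib
import hmac


def pseudonymize_subscriber_id(identifier: str, secret_key: bytes) -> str:
    """Replace a subscriber identifier (for example an MSISDN or IMSI) with a
    keyed pseudonym before it reaches dashboards or logs. Only components that
    hold the key, and are authorized to use it, can link the pseudonym back to
    the subscriber."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability in dashboards


# Example: an anomaly record exposes only the pseudonym to dashboard users.
# The identifier and key below are dummy values for illustration.
record = {
    "cell_id": "cell-3401",
    "subscriber": pseudonymize_subscriber_id("0000000000", b"replace-and-rotate-this-key"),
    "anomaly_score": 0.93,
}
```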

Overlooking trustworthy AI/ML within security and privacy

Trustworthy AI goes beyond security and privacy, delving into trust, safety, and explainability. According to the guidelines of the European Commission, trustworthy AI should be lawful, ethical, and robust [12]. Although AI security should be part of product security, unless AI trustworthiness is explicitly stated as a requirement, the product may remain vulnerable to some of the ML attacks outlined at the beginning of this article. Ensure that the team you work with understands the implications of AI trustworthiness for the success of your product. This includes not using customers’ data beyond its intended and stated purpose; any further usage must follow only after formal consent from all parties involved. Also, provide additional safeguards against potential harm (physical or psychological) that the system could cause to end users.

Closing thought

Security of any kind cannot be achieved by ticking off a checklist – it is a process. Work on the following to help avoid such pitfalls.

  • Understand the implications of AI-specific security and address those gaps in a machine-learning pipeline.
  • Identify a minimal security training program for non-security professionals.
  • Discuss security requirements and members responsible as early as possible, preferably before the start of the project engagement.
  • Shift-left security and security by design.
  • Assess the privacy impact of your data before committing it to a model.
  • Be wary of AI-specific attacks like poisoning, evasion, and model/data access.

For AI to be useful, people need to think of it as another way of building products rather than an academic exercise. The goal is not just to prove a possibility or to publish a paper; it is to build a product that provides value to us and our customers. That means that the 'mundane' aspects are just as important as the data science.

If you found this article useful, consider reading our extended paper [13].

Acknowledgments

We would like to thank Kalpana Angamuthu, Engineering Manager at BCSS BOS, and Michael Liljenstam, Principal Researcher at Ericsson Research, for reviewing this work and sharing their valuable suggestions.

References

[1] Garrison, Chlotia P., and Roderick B. Posey. "Computer security checklist for non-security technology professionals." Journal of international technology and information management 15.3 (2006): 7.
[2] Hyrum Anderson. "The practical divide between adversarial ML research and security practice: a red team perspective," Enigma, Feb 2021. 
[3] Wang, Chen, et al. "Poisoning attacks and countermeasures in intelligent networks: Status quo and prospects." Digital communications and networks 8.2 (2022): 225-234.
[4] Pavlos Papadopoulos, Will Abramson, Adam J Hall, Nikolaos Pitropakis, and William J Buchanan. 2021. Privacy and trust redefined in federated machine learning. Machine learning and knowledge extraction 3, 2 (2021), 333–356. 
[5] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and harnessing adversarial examples." International conference on learning representations (2015).
[6] Sagar, Ramani, Rutvij Jhaveri, and Carlos Borrego. "Applications in security and evasions in machine learning: a survey." Electronics 9.1 (2020): 97.
[7] Yuan, Xiaoyong, et al. "Adversarial examples: Attacks and defenses for deep learning." IEEE transactions on neural networks and learning systems 30.9 (2019): 2805-2824.
[8] Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, and Nicolas Papernot. 2020. High accuracy and high-fidelity extraction of neural networks. In Proceedings of the 29th USENIX Conference on Security Symposium. 1345–1362.
[9] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778.
[10] Elsayed, G. F., Goodfellow, I., & Sohl-Dickstein, J. (2018). Adversarial reprogramming of neural networks. International conference on learning representations (2019).
[11] OWASP Top 10. 
[12] European Commission. Ethics guidelines for trustworthy AI. (2019)
[13] ERHP Isaac, J Reno. AI product security: A primer for developers. arXiv preprint 2304.11087, 2023.
[14] Qin, Y., Carlini, N., Cottrell, G., Goodfellow, I., & Raffel, C. (2019, May). Imperceptible, robust, and targeted adversarial examples for automatic speech recognition.  International conference on machine learning (pp. 5231-5240). PMLR.

Further reading

Dario Casella and Lawrence Lawson. "AI and privacy: Everything you need to know about trust and technology," Aug 2022.

Anusha Mujumdar, Kristijonas Čyras, Saurabh Singh, and Aneta Vulgarakis. "Trustworthy AI: Explainability, safety, and verifiability," Dec 2020.
