The rise of AI voice and why telecom is its next big platform
- AI calling is evolving into a premium service that communication service providers (CSPs) deliver natively over the IP Multimedia Subsystem (IMS) network, with differentiated latency, security, and QoS. Network-native protection against AI-driven fraud and deepfakes adds a new trust layer that CSPs can monetize.
- Ericsson Ventures’ investments in Cartesia and Hiya highlight the network’s dual role in the AI voice era: enabling ultra‑low‑latency AI calling while safeguarding trust through native fraud protection.
For decades, Ericsson has maintained that communication is a fundamental human need. We have constantly striven to evolve our technology to enhance this connection, moving from fixed to mobile, and moving from voice to rich multimedia.
Yet, for over a century, the core voice experience remained remarkably stagnant. Since Ericsson deployed the first telephone exchange in 1880, voice tech has been a story of incrementalism. That era is officially over. We are currently witnessing a total paradigm shift: the rise of generative AI voice.
The industry has reached a point where AI-driven speech goes beyond basic intelligibility to become virtually indistinguishable from human speech. This means natural turn-taking, emotional nuance, and the kind of fluid, real-time interaction once reserved for science fiction.
The thesis: voice as the new interface
At Ericsson Ventures, we have spent the last several years diving deep into the audio frontier. Our thesis is simple:
Voice is the next dominant human-machine interface (HMI), and the global telecom network is its next big platform.
In early 2025, we executed on our conviction by investing in Cartesia.
While many companies continue to optimize around diffusion- and transformer-based architectures, Cartesia is taking a fundamentally different approach. Built from first principles, the company’s founding team pioneered the State Space Model architecture: a breakthrough that delivers the lowest-latency, most scalable, and most cost-effective AI model inference in the market, whether in the cloud, on premises, or on devices.
AI calling with IMS
In parallel with the investment, Ericsson is integrating AI capabilities into IMS voice calling for around one billion subscribers globally served by Ericsson’s IMS solutions, with or without the data channel mechanism. Although AI use cases are currently constrained by limited device support for the data channel, Ericsson is already collaborating with AI companies to introduce early AI services, described below, that enable CSP monetization and enhance subscriber experiences.
Ericsson is also actively mobilizing the broader industry ecosystem to accelerate adoption of the data channel, laying the foundation to unlock the full potential of AI-native voice over time.
What is IMS data channel?
The IMS data channel, a 3GPP standard, represents a fundamental evolution of Voice over LTE (VoLTE) and Voice over New Radio (VoNR) calling. It extends traditional voice calls with a synchronized, low-latency data path within the same call session, which allows voice to be augmented in real time with AI-driven capabilities, without pre-installed apps. By combining telecom-grade quality of service (QoS), deterministic latency, and network-level security with modern web and AI technologies, the IMS data channel enables a new class of AI-native calling use cases.
For voice, this means real-time translation, live captions, AI call agents, call screening, transcription, and intelligent call orchestration can all operate inside the call session itself, with consistent quality and reliability across devices and networks. In effect, the IMS data channel turns voice from a static transport service into a programmable, interactive platform, ready to support the next generation of AI-enhanced calling services at carrier scale.
AI calling can be deployed before the IMS data channel is available, but the data channel’s intuitive web menus significantly simplify service activation and content access, delivering a smoother experience than legacy platforms.
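As a rough illustration of how a service might decide whether a given call can use data channel features, the sketch below scans a call’s SDP for a data channel media section. It assumes the IMS data channel is negotiated with the same `m=application ... UDP/DTLS/SCTP webrtc-datachannel` media line that RFC 8841 defines for WebRTC data channels. The function names, the sample SDP bodies, and the two experience labels are hypothetical and are not taken from any Ericsson product.

```python
def offers_data_channel(sdp: str) -> bool:
    """Return True if the SDP carries a non-rejected data channel media section."""
    for line in sdp.splitlines():
        fields = line.strip().split()
        # Media line layout: m=<media> <port> <proto> <fmt>
        if len(fields) >= 4 and fields[0] == "m=application":
            port, proto, fmt = fields[1], fields[2], fields[3]
            # A port of 0 means the media section was rejected.
            if fmt == "webrtc-datachannel" and "SCTP" in proto and port != "0":
                return True
    return False


def select_experience(sdp: str) -> str:
    """Pick a (hypothetical) service tier based on data channel availability."""
    if offers_data_channel(sdp):
        return "voice-plus-data-channel"
    return "voice-only"


# Hypothetical, heavily trimmed SDP bodies for illustration only.
VOICE_ONLY_SDP = "v=0\r\nm=audio 49152 RTP/AVP 96\r\n"
WITH_DATA_CHANNEL_SDP = VOICE_ONLY_SDP + (
    "m=application 52718 UDP/DTLS/SCTP webrtc-datachannel\r\n"
    "a=sctp-port:5000\r\n"
)

if __name__ == "__main__":
    print(select_experience(VOICE_ONLY_SDP))
    print(select_experience(WITH_DATA_CHANNEL_SDP))
```

A check of this kind would let the same AI service fall back to a voice-only experience on devices that do not yet support the data channel.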
1) AI avatar agents for enterprise
This first AI use case introduces AI customer support agents by integrating Tavus’s photorealistic AI avatars and Cartesia’s ultra‑low‑latency text‑to‑speech directly into an IMS call session, all of it orchestrated by Daily’s Pipecat framework. Subscribers place a standard voice call to an enterprise support function, which is automatically upgraded with native video to present an AI avatar, without requiring any action from the caller.
Enterprises can also deploy branded AI avatar agents for outbound calls, creating a consistent and differentiated support experience. If customers enable their camera, the AI agent can also interpret visual cues such as facial expressions and body language, and adapt its responses to the emotional context of the conversation. The result is a far more intuitive and engaging support experience that combines the efficiency of AI with the trust and presence of face-to-face communication.
CSPs can deliver this use case today without the data channel; once the data channel is available, more interactive features can be added.
2) Real-time voice translation
Another use case is real-time voice translation, with live caption services natively embedded inside IMS voice calls. By leveraging Cartesia’s ultra-low-latency AI voice models, which can be deployed next to the mobile core, CSPs can deliver real-time, natural, bilingual conversations directly within the phone call: no separate app, no device processing limits, and no compromise on voice quality or latency.
This evolution enables natural, low-latency two-way translation, live captioning and translation for accessibility, and automatic transcription and summarization to boost business productivity.
But there is a dark side: AI also makes fraud easier
As this innovation takes hold, the threat landscape is shifting too. AI has also made spam and fraud calls more convincing, with deepfake voices now appearing in a growing share of scam call attempts.
Hiya, another Ericsson Ventures portfolio company, published its State of the Call 2026 research showing how quickly the call fraud threat is escalating:
- One in three consumers globally say they have encountered a deepfake voice call.
- Consumers now receive an average of 7.4 unwanted calls per week globally, a figure that has grown by approximately 16 percent annually since 2023.
- Eighty-six percent of unknown calls go unanswered, reflecting collapsing trust in the traditional voice channel.
- Thirty-eight percent of mobile subscribers say they would consider switching providers if they feel unprotected from AI scams.
This is why voice trust, once taken for granted, is now one of the telecom industry’s most urgent challenges.
1) Spam call: Protection is no longer optional
As AI-driven fraud grows more sophisticated, network-level voice security is quickly becoming a core requirement for CSPs, both to protect subscribers and to safeguard their own customer retention. Ericsson partners with leading ecosystem providers such as Hiya on a solution known as Call Qualification, a network-native capability integrated directly into the Ericsson IMS that enables real-time detection, labeling, and blocking of spam calls across all devices.
2) Branded call: Rebuilding trust and boosting enterprise reach
AI-powered fraud affects not only consumers but also businesses that rely on outbound voice. With 86 percent of unknown calls going unanswered, legitimate enterprises are increasingly struggling to reach their customers.
Hiya’s research shows:
- consumers are far more likely to answer when they can see verified caller identity
- most unknown calls are ignored entirely, making caller verification essential
- businesses report lost sales opportunities, slower deal velocity, lower customer satisfaction, and brand trust erosion when their calls go unanswered
This is where another capability integrated into the Ericsson IMS comes in: the Branded Call service, delivered jointly by Ericsson and Hiya to boost trust in enterprise outbound calls. For enterprises, this is no longer just a marketing tool; it is an operational necessity for restoring trust in the voice channel.
Hiya has been working with Ericsson to bring branded calls to CSPs.
The ultimate use case: network-native personal AI voice assistant
Ericsson, together with companies such as Cartesia and Hiya, is actively exploring a self-service AI calling capability that enables CSPs to offer personal AI voice assistants at network scale. Unlike over-the-top applications, this assistant is embedded directly within the IMS call session, allowing AI functionality to be delivered as a native network service.
A network-native AI call agent can answer incoming calls on a subscriber’s behalf based on configurable user policies, qualifying calls for intent and risk before connecting them. It can conversationally challenge suspicious callers, detect scam patterns, including deepfake impersonation attempts, and escalate only those calls that meet a predefined trust threshold. For legitimate interactions, the agent can generate real-time transcripts, produce concise summaries and action items, and attach them to the call log for later review.
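The qualification logic described above can be pictured as a small policy check. The following is a minimal sketch under stated assumptions, not a description of how Ericsson, Cartesia, or Hiya implement it: the class names, the 0-to-1 trust score, the threshold values, and the three possible actions are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONNECT = "connect"      # ring the subscriber
    CHALLENGE = "challenge"  # the agent questions the caller first
    BLOCK = "block"          # reject the call


@dataclass(frozen=True)
class SubscriberPolicy:
    # All fields and default values are illustrative assumptions.
    opted_in: bool = False         # the agent acts only for subscribers who opt in
    trust_threshold: float = 0.7   # minimum score needed to connect directly
    block_threshold: float = 0.2   # below this score the call is rejected


@dataclass(frozen=True)
class CallSignals:
    trust_score: float               # 0.0 (certain scam) to 1.0 (fully trusted)
    deepfake_suspected: bool = False


def qualify_call(policy: SubscriberPolicy, signals: CallSignals) -> Action:
    """Decide what to do with an incoming call under the subscriber's policy."""
    if not policy.opted_in:
        # Without consent the agent stays out of the call path.
        return Action.CONNECT
    if signals.trust_score < policy.block_threshold:
        return Action.BLOCK
    if signals.deepfake_suspected or signals.trust_score < policy.trust_threshold:
        return Action.CHALLENGE
    return Action.CONNECT


if __name__ == "__main__":
    policy = SubscriberPolicy(opted_in=True)
    print(qualify_call(policy, CallSignals(trust_score=0.9)))
    print(qualify_call(policy, CallSignals(trust_score=0.9, deepfake_suspected=True)))
    print(qualify_call(policy, CallSignals(trust_score=0.1)))
```

In this sketch a suspected deepfake always triggers a conversational challenge, even when the caller’s score would otherwise clear the threshold, which mirrors the idea of escalating only calls that meet the subscriber’s trust bar.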
These capabilities allow CSPs to offer premium AI calling tiers that combine call protection, productivity enhancements, and privacy controls, delivered seamlessly within the IMS voice experience rather than as a separate app layer.
Subscribers remain fully in control. They can opt in to call answering and qualification features, activate the AI agent during a call using a keyword or a simple on-screen interaction, and ensure the agent only listens when explicitly enabled. This approach places transparency and user consent at the core of the experience.
By introducing these capabilities, CSPs can launch new premium, differentiated calling experiences, underpinned by telecom-grade latency, security, and quality of service, setting a new benchmark in the industry for speed, reliability, and user-centric innovation.
Telecom is AI voice’s next big platform
For the last two decades, CSPs have played a critical role in deploying mobile broadband at global scale. At the same time, innovation and monetization have increasingly concentrated on the application and platform layers, driven mainly by over-the-top players. As a result, a significant share of the value creation in the app economy shifted toward technology platforms rather than network operators.
AI voice is different.
For the first time in years, CSPs have a structural advantage. By deploying voice models at the edge of the telecom network, we eliminate the latency tax, transforming voice from a best-effort web service into a deterministic, instantaneous human experience. By integrating these models natively, we do not just gain speed; we also inherit the hardened, carrier-grade security and five-nines reliability that have been the gold standard of global communication for over a century. This, combined with native, network-level spam protection and identity verification, offers a level of integrity and safety that fragmented, over-the-top applications simply cannot replicate.
In the high-stakes race to dominate the voice interface, the CSP that controls the network controls the experience. The network is no longer a commodity. It is the differentiator.
Building a new trust layer for voice
As 2026 unfolds, the winners in this new era will be the CSPs that embrace both sides of the equation:
- AI-native voice experiences
- AI-powered voice security and branded identity
Together, these form a new trust layer for global voice communication, protecting users while unlocking new monetization for CSPs. This future is already unfolding, and it will redefine how billions of people communicate every day.
Ericsson will showcase several AI voice use cases at the 2026 Core Network Summit in Rome in May. Please check out the event page for more information and for videos of demos after the event.
Further reading
- For more details on IMS interactive calls, please check out Ericsson’s 5G voice offering.
- Learn how Ericsson Ventures backs leading companies to drive innovation and accelerates our core business.
- Learn how mobile networks and AI power each other.
- Discover what makes IMS a booster of trusted answered calls rates.