Groq and PlayAI announced a partnership today to bring Dialog, an advanced text-to-speech model, to market through Groq’s high-speed inference platform.
The partnership combines PlayAI’s expertise in voice AI with Groq’s specialized processing infrastructure, creating what the companies claim is one of the most natural-sounding and responsive text-to-speech systems available.
“Groq provides a complete, low-latency system for automatic speech recognition (ASR), GenAI, and text-to-speech, all in one place,” said Ian Andrews, Chief Revenue Officer at Groq, in an exclusive interview with VentureBeat. “With Dialog now running on GroqCloud, this means customers won’t have to use multiple providers for a single use case; Groq is a one-stop solution.”
Groq powers first Arabic voice AI, expanding Middle East tech presence
Dialog is available in both English and Arabic, and the companies bill the Arabic version as the first voice AI designed specifically for the Middle East region. Including Arabic among the initial offerings was a strategic choice for both companies.
“Arabic is the fourth most spoken language globally — by partnering with PlayAI to offer an Arabic TTS model, Groq is unlocking a key global market and enabling broader access to fast AI inference,” Andrews told VentureBeat.
The companies claim their solution addresses key shortcomings in existing voice AI technologies, particularly around natural speech patterns and response speed. According to benchmark testing conducted by third-party evaluator Podonos, Dialog was preferred by users at a rate of 10:1 versus ElevenLabs v2.5 Turbo and over 3:1 against ElevenLabs Multilingual v2.0.
Innovative ‘adaptive speech contextualizer’ transforms conversational AI
What sets Dialog apart is its sophisticated approach to context. Rather than treating each vocalization as an isolated event, the system maintains awareness of the entire conversation flow.
“We built a novel architecture that we call an ‘adaptive speech contextualizer’ (ASC), which allows the model to use the full context and history of a conversation,” said Mahmoud Felfel, co-founder and CEO of PlayAI, in an interview with VentureBeat. “This means that every response isn’t just a standalone output; it’s enriched with appropriate prosody, tone, and emotion that reflect the flow of the conversation.”
For enterprises looking to implement conversational AI, latency — the delay between request and response — has been a persistent challenge. Groq’s specialized Language Processing Units (LPUs) appear to provide a significant advantage in this area.
“Based on initial internal testing, Groq is delivering up to 140 characters per second on PlayAI’s Dialog model, a significant boost compared to the same model running on GPUs at 86 characters per second,” explained Andrews. “That means that Dialog generates text up to 10 times faster than real-time.”
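The figures in that quote can be checked with a little arithmetic. The Python sketch below computes the ratio between the two quoted throughput numbers and, taking the "10 times faster than real-time" claim at face value, works out the speaking pace that claim would imply. The function and constant names are illustrative, and the implied speaking rate is a derived figure, not one reported by either company.

```python
# Figures quoted in the article (characters of input text processed per second).
LPU_CHARS_PER_SEC = 140.0
GPU_CHARS_PER_SEC = 86.0
CLAIMED_REALTIME_FACTOR = 10.0  # "up to 10 times faster than real-time"


def speedup(fast: float, slow: float) -> float:
    """Return the ratio between two throughput figures."""
    return fast / slow


def implied_speaking_rate(throughput: float, realtime_factor: float) -> float:
    """Return characters spoken per second of audio, assuming synthesis at
    `throughput` runs `realtime_factor` times faster than playback."""
    return throughput / realtime_factor


if __name__ == "__main__":
    ratio = speedup(LPU_CHARS_PER_SEC, GPU_CHARS_PER_SEC)
    pace = implied_speaking_rate(LPU_CHARS_PER_SEC, CLAIMED_REALTIME_FACTOR)
    print(f"LPU vs GPU throughput: {ratio:.2f}x")
    print(f"Implied speaking pace: {pace:.1f} characters per second")
```

On these numbers, the LPU figure is roughly 1.6 times the GPU figure, and a 10x real-time factor would correspond to speech paced at about 14 characters per second.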
Groq secures $1.5 billion Saudi investment to build world-class AI infrastructure
The partnership comes at a time of significant expansion for Groq, which recently secured a $1.5 billion commitment from Saudi Arabia to fund additional infrastructure. The company has established a data center in Dammam, which it describes as “the region’s largest inference cluster.”
“Partnering with Groq was a no-brainer; they’re the industry leader in advanced AI inference infrastructure,” said Felfel. “With TTS and agents, low latency is key. We’ve already optimized Dialog for these real-time applications, but partnering with Groq allows us to deliver the lowest latency voice model on the market.”
The voice AI market has seen rapid growth as businesses look to automate customer interactions while maintaining a natural, human-like experience. Applications range from customer service and sales automation to voice-overs and accessibility features for the visually impaired.
Enterprise applications extend beyond traditional customer service use cases
“Beyond customer service, other enterprise use cases include automating sales and appointment scheduling, on-boarding and personal assistants, creating voice overs to existing content, translating English audio and video content into Arabic, increasing website and static content accessibility for the visually impaired, and more,” Andrews said.
For PlayAI, which was founded by entrepreneurs from the Middle East and North Africa region, the inclusion of Arabic language capabilities was particularly meaningful.
“As MENA founders, we know the region is heavily investing in AI capabilities and infrastructure as [reflected] in investments like Groq, but also world-leading adoption,” said Felfel. “Arabic is a global business language and one that we grew up speaking, so it was a natural choice as one of our core languages.”
The companies have made the Dialog technology available through GroqCloud’s tiered service model, which includes both free and paid options. This approach allows developers to experiment with the technology before committing to larger implementations.
“GroqCloud offers both free and paid plans. Anyone can create an account and create an API [key] for free,” Andrews explained. “Our paid Developer Tier is self-serve, meaning anyone with a credit card can sign up themselves.”
As voice becomes an increasingly important interface for AI systems, this partnership positions both companies to capitalize on the growing demand for more natural and responsive conversational experiences. By addressing the technical challenges of latency and natural speech patterns, Groq and PlayAI may have removed significant barriers to wider adoption of voice AI in enterprise settings.