First Impressions of ChatGPT’s Advanced Voice Mode: Fun and a Little Scary
I left ChatGPT’s Advanced Voice Mode on while writing this article, as an ambient AI companion. Occasionally, I’d ask it for a synonym for an overused word or some words of encouragement. About half an hour in, ChatGPT broke our silence and started talking to me in Spanish, unprompted. I chuckled a little and asked what was going on. “Just a little change? Gotta keep things interesting,” said ChatGPT, now back in English.
While testing Advanced Voice Mode as part of the first alpha, my interactions with ChatGPT’s new audio feature were fun, messy, and surprisingly varied, though it’s worth noting that the features I had access to were only half of what OpenAI demonstrated when it launched the GPT-4o model in May. The vision capabilities we saw in the livestream demo are now scheduled for a later release, and the enhanced Sky voice, which Her performer Scarlett Johansson pushed back on, has been removed from Advanced Voice Mode and is no longer an option for users.
So how does it feel now? Right now, Advanced Voice Mode is reminiscent of when the original text-based ChatGPT debuted in late 2022. Sometimes it leads to unimpressive dead ends or devolves into AI clichés. But other times, the low-latency conversations flow in a way that Apple’s Siri or Amazon’s Alexa never could for me, and I feel compelled to keep chatting just for the fun of it. This is the kind of AI tool you’d show your relatives over the holidays for a laugh.
OpenAI gave some WIRED reporters access to the feature a week after the initial announcement but pulled it the next morning, citing safety concerns. Two months later, OpenAI soft-launched Advanced Voice Mode to a small group of users and released the GPT-4o system card, a technical document outlining its red-teaming efforts, what the company considers to be safety risks, and the mitigation steps it has taken to minimize harm.
Curious to try it out for yourself? Here’s what you need to know about the broader rollout of Advanced Voice Mode, along with my first impressions of ChatGPT’s new voice feature, to help you get started.
So, when is the full rollout?
OpenAI launched its audio-only Advanced Voice Mode to some ChatGPT Plus users in late July, and the alpha group appears to still be relatively small. The company plans to enable it for all subscribers sometime this fall. OpenAI spokesperson Niko Felix did not share any further details when asked about the release timeline.
Screen sharing and video were a core part of the original demo, but they’re not available in this alpha. OpenAI plans to add those aspects later, but it’s unclear when that will happen.
If you are a ChatGPT Plus subscriber, you will receive an email from OpenAI when Advanced Voice Mode is available to you. Once it’s in your account, you can switch between Standard and Advanced at the top of the app screen whenever ChatGPT’s voice mode is open. I was able to test the alpha version on an iPhone as well as a Galaxy Fold.
My first impressions of ChatGPT’s Advanced Voice Mode
Within the first hour of talking with it, I learned that I love interrupting ChatGPT. It’s not how you’d talk to a human, but the new ability to cut ChatGPT off mid-sentence and request a different version of the output feels like a dynamic improvement and a standout feature.
Early adopters who were excited by the original demo may be disappointed to be handed a limited version of Advanced Voice Mode with more restrictions than expected. For example, while AI-generated singing was a key component of the launch demo, with lullabies and multiple voices attempting to harmonize, AI singing is not available in the alpha version.