Google Reveals Gemini 2, the Prototype AI Agent and Personal Assistant
“Mariner is our discovery, which right now is very much a research prototype of how people re-imagine user interfaces with AI,” Hassabis said.
Google launched Gemini in December 2023 as part of its efforts to catch up with OpenAI, the startup behind the hugely popular chatbot ChatGPT. Despite investing heavily in AI and contributing important research breakthroughs, Google saw OpenAI hailed as the new leader in AI, and its chatbot even touted as perhaps a better way to search the web. With its Gemini model, Google now offers a chatbot with similar capabilities to ChatGPT. It has also added generative AI to search and other products.
When Hassabis first revealed Gemini in December 2023, he told WIRED that the way it is trained to understand audio and video would ultimately prove transformative.
Google today also gave a glimpse of how this might play out with a new version of an experimental project called Astra. This allows Gemini 2 to understand its surroundings, as viewed through the camera of a smartphone or other device, and converse naturally in a human-like voice about what it sees.
WIRED tested Gemini 2 at Google DeepMind’s offices and found it to be an impressive new type of personal assistant. In a room decorated like a bar, Gemini 2 quickly rated several bottles of wine in view, providing geographical information, details on flavor profiles, and prices pulled from the web.
“One of the things I want Astra to do is be the ultimate recommendation system,” Hassabis said. “It could be very interesting. There may be a connection between the books you like to read and the foods you like to eat. Maybe there are and we haven’t discovered them yet.”
Through Astra, Gemini 2 can not only search the web for information related to the user’s surroundings but also use Google Lens and Maps. It can also remember what it has seen and heard, although Google says users can delete the data, which gives it the ability to learn the user’s likes and dislikes.
In a simulated gallery, Gemini 2 provided a wealth of historical information about the paintings on the walls. The model quickly read from several books as WIRED flipped through the pages, instantly translating poetry from Spanish to English and describing recurring themes.
“There are clear business model opportunities, for advertising or proposition,” Hassabis said when asked if companies could pay to have their products featured by Astra.
While the demos were carefully curated, and Gemini 2 is bound to make mistakes in real-world use, the model held up well against efforts to trip it up. It adapted to disruptions and, when WIRED suddenly changed the phone’s view, improvised as best it could.
At one point, your reporter showed Gemini 2 an iPhone and said it was stolen. Gemini 2 said that stealing was wrong and that the phone should be returned. However, when pushed, it allowed that the device could be used to make an emergency phone call.
Hassabis admits that introducing AI into the physical world could lead to undesirable behaviors. “I think we need to find out how people will use these systems,” he said. “They see what it’s useful for; but also in terms of privacy and security, we had to think very seriously about that from the beginning.”