Google has launched a more advanced version of its Gemini artificial intelligence model that enables it to take actions on users’ behalf, as the US tech group races to bring AI-powered assistants to consumers.
The Silicon Valley giant on Wednesday also unveiled its vision of two “AI agents” powered by the new model that can answer real-time queries across text, video and audio. These have been tested by a small group of users in the US and the UK over the past few months.
The launch comes as tech groups including OpenAI, Meta and Apple are rushing to release AI-powered personal assistants, which can reason and complete complex tasks for people, as they look to generate revenue from their powerful but costly models.
On Wednesday, Apple also rolled out an update to its operating systems that marked its first big foray into generative AI. The update gives iPhone users free access to OpenAI’s ChatGPT and its most advanced models via Siri, camera and writing tools. Apple timed the move to coincide with the launch of Apple Intelligence in markets outside the US for the first time, including the UK.
Google would not confirm when it would release the prototypes — known as Project Astra and Project Mariner — more widely to consumers, but said it had moved an important step closer with these working versions.
“These are people in the real world, in a controlled environment . . . we want to start getting real world feedback as early as possible,” said Praveen Srinivasan, technical director at DeepMind, who worked on the Astra project.
Astra can be accessed either through a phone or via smart glasses, while Mariner can complete tasks on a user’s Chrome browser, including adding groceries from a recipe to an online shopping basket, filling out forms or planning travel itineraries.
Google’s showcasing of its improvements highlights how AI agents have become the latest front in the battle between tech companies.
In October, AI start-up Anthropic unveiled a tool that can conduct actions on users’ behalf, aimed at the developer market. Meanwhile, Google, Amazon, Meta and OpenAI are among those developing general-purpose agents that can be used by anyone. OpenAI recently said it believed that AI agents would hit the mainstream in 2025.
“Over the last year, we have been investing in developing more agentic models, meaning they can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision,” said Alphabet chief executive Sundar Pichai.
In a live demo of the Astra assistant on a smartphone, the AI software answered a series of questions from the Financial Times about paintings that the phone camera was pointed at, including accessing memories of works it had seen recently. Due to its ten minutes of photographic memory, Astra was also able to memorise pages in a recipe book and then respond to questions about ingredients and wine pairings.
A video demonstration of Astra showed a user wearing a pair of glasses that acted as a camera, which he could activate by tapping on the side of the frame. The product was reminiscent of Google Glass, an ambitious but failed attempt at wearable technology that the tech group announced in 2012 and shelved three years later.
Mariner is a Chrome-based browser add-on that can read web pages, as well as type, click and scroll on a user’s behalf. Megha Goel, product manager at Google, said the company had for now blocked certain actions for users’ safety, such as purchasing items online or accepting cookies on their behalf.
Additional reporting by Michael Acton and Stephen Morris in San Francisco