Google puts AI agents at the center of Gemini update

Reuters || Shining BD

Published: 12/11/2024 6:16:38 PM
Gemini logo is seen in this illustration taken May 20, 2024. REUTERS/Dado Ruvic/Illustration/File photo

Gemini logo is seen in this illustration taken May 20, 2024. REUTERS/Dado Ruvic/Illustration/File photo

Alphabet's Google on Wednesday released the second generation of its artificial intelligence model Gemini and teased a slate of new ways to use AI beyond chatbots, including through a pair of eyeglasses.

CEO Sundar Pichai in a blog post dubbed the moment as the start of a "new agentic era," referring to virtual assistants that can perform tasks with greater autonomy.

"They can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision."

The releases underscore the methods by which Google is aiming to reclaim the lead in the race to dominate the emerging technology. Microsoft-backed OpenAI captured global attention when it released chatbot ChatGPT in November 2022.

Google unveiled Gemini in December 2023 and now offers four versions.

On Wednesday, it released an update to Flash, its second cheapest model, with improved performance and added features to process images and audio. Other models will come next year.

OpenAI has in recent days announced a flurry of new offerings to diversify its prospects including a $200-a-month ChatGPT subscription for advanced research use and the availability of its text-to-video model Sora.

Google's play involves injecting its AI advances into applications that already enjoy widespread adoption. Search, Android and YouTube are among seven products that the company says are used by more than 2 billion people monthly.

That user base is a significant advantage over challenger startups such as search startup Perplexity, which is seeking a $9 billion valuation, and newer research labs like OpenAI, Anthropic or Elon Musk's xAI.

The Gemini 2.0 Flash model will power applications including AI Overviews in its search engine.

Alphabet's biggest bet is AI for search, Ruth Porat, the president and chief investment officer, said at the Reuters NEXT conference in New York on Tuesday.

Google also showed reporters new capabilities for Project Astra, a prototype universal agent which can talk to users about anything captured on their smartphone camera in real time.

The tool can now hold a conversation spoken in a mix of languages, as well as process information from Maps and image recognition tool Lens, DeepMind group product manager Bibo Xu told reporters.

And Astra will also be tested on prototype eyeglasses, the company's first return to the product area since the failure of Google Glasses. Others have since entered the market including Meta which in September unveiled an AR glasses prototype.

Google also showed reporters Project Mariner, a Chrome web browser extension which can automate keystrokes and mouse clicks in the vein of rival lab Anthropic's "computer use" feature, a feature to improve software coding called Jules, and a tool to assist consumers in making decisions like what to do or which items to buy in video games.

Shining BD