In a major leap for AI-driven productivity, Google has launched Gemini 2.5 Computer Use, a cutting-edge model that can perform web tasks with human-like precision. Unlike traditional AI that only responds to text prompts, this model can navigate websites, interact with user interfaces, and complete online tasks autonomously.
Our new Gemini 2.5 Computer Use model is now available in the Gemini API, setting a new standard on multiple benchmarks with lower latency. These are early days, but the model’s ability to interact with the web – like scrolling, filling forms + navigating dropdowns – is an… pic.twitter.com/4PJoat9bwI
— Sundar Pichai (@sundarpichai) October 7, 2025
The new model, built on Gemini 2.5 Pro, is designed to understand complex instructions and execute them through real-time actions such as clicking buttons, typing into forms, scrolling pages, and interacting with dropdown menus. In demos, the model has been shown organising digital sticky notes into user-defined categories, mimicking the way a human would tidy a cluttered virtual workspace.
Google emphasises that Gemini 2.5 Computer Use is currently available to developers via Google AI Studio and Vertex AI, where it can be used for experimentation and automation. While it supports 13 types of web interactions at present, the model is not yet equipped to control desktop applications or system-level functions.
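Under the hood, computer-use models of this kind run in a client-side loop: the model proposes a UI action, the developer's code executes it in a browser, and a fresh screenshot is fed back for the next turn. The sketch below illustrates that loop only; it is not Google's SDK, and the model and browser driver are simulated with stand-in functions so the example is self-contained.

```python
# Illustrative sketch of the agent loop behind a computer-use model.
# NOT Google's API: the model and browser driver are simulated stand-ins.

from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "scroll", "done"
    target: str = ""   # hypothetical UI element identifier
    text: str = ""     # text payload for "type" actions


def fake_model(history):
    """Stand-in for the model: returns the next scripted UI action
    based on how many steps have already been taken."""
    script = [
        Action("click", target="login_button"),
        Action("type", target="username_field", text="demo"),
        Action("scroll", target="page"),
        Action("done"),
    ]
    return script[len(history)]


def execute(action):
    """Stand-in for a browser driver: pretend to perform the action
    and return a fake 'screenshot' (here, just a status string)."""
    return f"after {action.kind} on {action.target or 'page'}"


def run_agent(max_steps=10):
    """Loop: ask the model for an action, execute it, feed the result
    back, and stop when the model signals it is done."""
    history = []
    for _ in range(max_steps):
        action = fake_model(history)
        if action.kind == "done":
            break
        screenshot = execute(action)
        history.append((action, screenshot))  # context for the next turn
    return history


if __name__ == "__main__":
    for act, shot in run_agent():
        print(act.kind, "->", shot)
```

In the real API, `fake_model` would be a call to the hosted model and `execute` would drive an actual browser (for instance via an automation library), but the observe-act-observe structure is the same.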
The company also highlighted the model’s potential for software testing, noting that internal teams are already leveraging it to accelerate UI testing processes. Variants of Gemini 2.5 are integrated into other Google initiatives as well, including AI Mode in Search, the Firebase Testing Agent, and Project Mariner, which enables users to assign AI agents to handle tasks like research, planning, and data entry via natural language instructions.
