Google Unveils Gemini 2.5: AI Model with Human-Like Web Browsing Abilities

Gemini 2.5 Computer Use is a new Google AI model that operates a virtual browser to fill out forms, scroll, click, and type on web pages as a human would.

Gemini 2.5 Computer Use is based on the Gemini 2.5 Pro model architecture. Google states that the model is “capable of interacting with user interfaces” and claims it performs better than competing models, with lower latency and better benchmark results.

Google said in a blog post that Gemini 2.5 Computer Use can “follow your instructions to perform web tasks that require complex navigation and interaction.” Developers can access the tool through Google AI Studio and Vertex AI. The model can type, click, scroll, open a dropdown menu, move the cursor or “hover,” and use keyboard shortcuts, all while controlling a virtual browser the way a person would.
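To make the idea of a model "controlling a virtual browser" more concrete, the sketch below shows one way a list of model-issued actions could be applied to a browser. It is an illustration only, not Google's API: the `FakeBrowser` class, the `run_actions` function, and the action names and fields are all hypothetical, and the browser is a simple in-memory stand-in rather than a real one.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Hypothetical in-memory stand-in for a virtual browser."""
    log: list = field(default_factory=list)      # actions applied, in order
    fields: dict = field(default_factory=dict)   # text typed into each field
    scroll_y: int = 0                            # current vertical scroll offset

    def click(self, x, y):
        self.log.append(f"click({x},{y})")

    def hover(self, x, y):
        self.log.append(f"hover({x},{y})")

    def type_text(self, selector, text):
        self.fields[selector] = text
        self.log.append(f"type({selector})")

    def scroll(self, dy):
        # Never scroll above the top of the page.
        self.scroll_y = max(0, self.scroll_y + dy)
        self.log.append(f"scroll({dy})")


def run_actions(browser, actions):
    """Apply each action dict to the browser, rejecting unknown action names."""
    handlers = {
        "click": lambda a: browser.click(a["x"], a["y"]),
        "hover": lambda a: browser.hover(a["x"], a["y"]),
        "type": lambda a: browser.type_text(a["selector"], a["text"]),
        "scroll": lambda a: browser.scroll(a["dy"]),
    }
    for action in actions:
        name = action["name"]
        if name not in handlers:
            raise ValueError(f"unsupported action: {name}")
        handlers[name](action)
    return browser


if __name__ == "__main__":
    # Assumed example: the "model" asks to fill a field, scroll, then click.
    browser = run_actions(FakeBrowser(), [
        {"name": "type", "selector": "#email", "text": "user@example.com"},
        {"name": "scroll", "dy": 300},
        {"name": "click", "x": 120, "y": 480},
    ])
    print(browser.log)
```

In a real agent this dispatch step would sit inside a loop: the model receives a screenshot and the task, proposes the next action, the client executes it, and a fresh screenshot is sent back. Restricting execution to a fixed table of known actions mirrors the fact that the model supports a limited action set.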

Google also posted demo videos showing the model performing a task, though these are sped up three times. In one of the demos, Gemini is instructed: “My art club brainstormed tasks ahead of our fair. The board is a mess, and I want you to help me organize the tasks into some categories I created. Go to stick-note-jam.web.app and make sure the notes are clearly in the right section. Drag them there if they aren’t.”

The model, Google claims, “outperforms leading alternatives on multiple web and mobile benchmarks.” However, it currently supports only 13 actions. For now it can be used only to interact with web browsers, and Google notes that it does not yet support control of a desktop operating system.

Internally, Google teams are using the model for UI testing, which can dramatically reduce the time needed for software testing. Variants of the model power Gemini’s agentic features in AI Mode in Search, the Firebase Testing Agent, and Project Mariner, a platform where users can use natural language to assign AI agents tasks such as research, planning, and data entry.
