Introducing computer use in Gemini 3.5 Flash

The feature lets developers build AI agents that automate visual interfaces, bypassing the need for native API support and accelerating workflow automation for legacy or custom software.

Google DeepMind · 1 min read · release
deepmind.google

What happened

Google DeepMind announced that Gemini 3.5 Flash now supports computer use, which lets the model interact directly with graphical user interfaces such as browsers and desktop applications. Instead of relying on APIs, the model observes screen content via screenshots and performs actions such as clicking buttons, typing text, and navigating menus. The capability is available through Google AI Studio and the Gemini API.

For developers building AI workflows, this means tasks involving legacy or no-API software can be automated without writing custom integrations. According to Google DeepMind, the computer use function is designed for transparency and safety, with users able to see the model’s reasoning at each step. Early adopters have used it for data entry, web research, and software testing.

The release positions Gemini as a more versatile tool for building autonomous agents that handle complex, multi-step processes in real-world applications. It also lowers the barrier to practical AI agents, since developers can have a model operate software visually, much as a human would.
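The observe-and-act cycle described above can be sketched as a simple loop. The following is a minimal illustration only, not the Gemini API: `Action`, `run_agent`, `scripted_model`, and the `capture` and `execute` callbacks are all hypothetical names introduced here, and the model is replaced by a scripted stand-in so the sketch runs without network access. It assumes the model returns one action per screenshot, along with a short statement of its reasoning.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step proposed by the model (hypothetical shape, for illustration)."""
    kind: str                    # "click", "type", or "done"
    reasoning: str               # the model's stated rationale for this step
    x: Optional[int] = None      # click coordinates, if any
    y: Optional[int] = None
    text: Optional[str] = None   # text to type, if any


# Hypothetical model interface: (goal, screenshot bytes, history) -> next action.
ModelFn = Callable[[str, bytes, List[Action]], Action]


def run_agent(
    goal: str,
    model: ModelFn,
    capture: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask the model for an action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                    # observe the current screen
        action = model(goal, screenshot, history)
        history.append(action)
        # Surface the reasoning at every step so a person can follow along.
        print(f"step {len(history)}: {action.kind} ({action.reasoning})")
        if action.kind == "done":
            break
        execute(action)                           # perform the click or typing
    return history


def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a real model call: replays a fixed script of actions."""
    script = [
        Action("click", "Focus the name field", x=120, y=80),
        Action("type", "Enter the requested value", text="Ada"),
        Action("done", "The form is filled in"),
    ]
    return script[min(len(history), len(script) - 1)]


if __name__ == "__main__":
    performed: List[Action] = []
    run_agent(
        goal="Fill in the name field",
        model=scripted_model,
        capture=lambda: b"",         # a real client would grab a screenshot here
        execute=performed.append,    # a real client would drive the mouse and keyboard
    )
```

In a real integration, `capture` would return an actual screenshot, `execute` would drive the browser or desktop, and `model` would call the Gemini API. The `max_steps` cap is a design choice in this sketch to keep a confused agent from looping indefinitely.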

Key takeaways

  • Gemini 3.5 Flash can now read screenshots and perform GUI actions such as clicking and typing.
  • The feature reduces the need for API integrations to automate software.
  • Available via Google AI Studio and the Gemini API for developers.
  • Users can view the model's reasoning steps for transparency and safety.
  • Practical applications include data entry, web research, and software testing.

Why it matters

Many legacy and custom applications expose no API, so automating them has meant writing custom integrations. With computer use, developers can build agents that operate such software through its visual interface, which shortens the path to workflow automation.

This is an original editorial digest by AI Workflow Pro. Full reporting at the source:

Read the original on Google DeepMind