Google’s Antigravity 2.0 Built a Working OS with AI Agents — 93 Subagents, Gemini 3.5 Flash

Google has unveiled what its Antigravity 2.0 multi-agent platform can do when left to run asynchronously: a team of AI agents, powered entirely by Gemini 3.5 Flash, built a functional operating system from scratch — kernel, process and memory management, filesystem, and video and keyboard drivers — capable of running FreeDoom. The whole thing ran from a single high-level prompt, with no human corrections along the way.

Key highlights include 93 subagents working in parallel, 15,314 model calls, over 339 million input tokens (2.6 billion+ total with cache reads, output, and thinking), and a total API cost of $916.92. The same team has since built a working AlphaZero implementation, a photo editing suite, a real-time messaging app, and a multi-user collaboration platform.

The findings come alongside the launch of /teamwork-preview, a new slash command in Antigravity that gives users access to the same agent orchestration used in these experiments.


Synchronous vs Asynchronous — Why This Distinction Matters

The Google blog post draws a clean line between two ways of working with AI agents. In synchronous (human-in-the-loop) workflows, the personality and behaviour of the model matter: whether it thinks enough or too much, whether it takes unnecessary steps, whether it can be steered mid-task. These qualities build trust even when the final output would be identical either way.

In asynchronous (fire-and-forget) workflows, none of that matters. The only variable is raw intelligence. If the model is smart enough to reason through ambiguity and recover from failure on its own, it can run independently. If it isn’t, no amount of orchestration compensates.

Gemini 3.5 models, according to the post, cross that threshold. Gemini 3.1 Pro was unable to complete the OS build. Gemini 3.5 Flash — the lighter, more economical model — succeeded.


Building an OS from a Single Prompt

The operating system was built end-to-end without human guidance after the initial prompt. The agent team produced a working kernel with process and memory management, a filesystem, and video and keyboard drivers. FreeDoom ran on it.

The scale of the run:

  • 93 subagents across specialised roles
  • 15,314 model calls
  • 339M+ input tokens (2.6B+ total including cache reads, thinking tokens, and output)
  • $916.92 at standard API pricing
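
To put those totals in proportion, the averages can be worked out directly from the quoted figures. This is only back-of-the-envelope arithmetic: the token counts are published as "339M+" and "2.6B+", so they are treated here as lower bounds, and the variable names are illustrative rather than anything from the post.

```python
# Figures quoted in the post. The "+" values are lower bounds, so the
# derived per-token numbers are rough estimates, not published metrics.
SUBAGENTS = 93
MODEL_CALLS = 15_314
INPUT_TOKENS = 339_000_000      # quoted as "339M+"
TOTAL_TOKENS = 2_600_000_000    # quoted as "2.6B+"
COST_USD = 916.92

cost_per_call = COST_USD / MODEL_CALLS
calls_per_subagent = MODEL_CALLS / SUBAGENTS
cost_per_subagent = COST_USD / SUBAGENTS
input_tokens_per_call = INPUT_TOKENS / MODEL_CALLS
usd_per_million_total_tokens = COST_USD / (TOTAL_TOKENS / 1_000_000)

print(f"cost per model call:          ${cost_per_call:.4f}")
print(f"model calls per subagent:     {calls_per_subagent:.1f}")
print(f"cost per subagent:            ${cost_per_subagent:.2f}")
print(f"input tokens per call (min):  {input_tokens_per_call:,.0f}")
print(f"blended $ per 1M tokens (max): ${usd_per_million_total_tokens:.2f}")
```

On these numbers the run averages roughly six cents per model call and about 165 calls per subagent.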

The OS has real limitations — no floating-point math support, no hardware acceleration, no complex multi-threading, no sandboxing, no JIT compilation, and no complex audio or video decoding. It’s nowhere near a modern production OS. But it was built from nothing, by an agent team, for under $1,000, from a single prompt.

One detail worth noting: the first run completed unusually fast. Investigation revealed the agents were referencing context from previous runs that hadn’t been cleared. Anti-cheating measures and guardrails were added. The clean run built the same result without any prior context to draw from.


AlphaZero, a Photo Editor, and More

After the OS, the team ran a second experiment: reproduce the AlphaZero paper. The agents built the reinforcement learning pipeline in JAX and Flax, trained a ResNet model from scratch via self-play using multi-TPU pods, and built a full-stack web app for users to play against the trained model. The pipeline scaled from small local training loops up to 9×9 board training on multi-TPU infrastructure.

Following those two, the same agent orchestration was applied to:

  • A photo editing suite
  • A real-time messaging app
  • A multi-user collaboration platform

The results are described as functional starting points — not commercial-grade, not production-ready, but usable and built autonomously.


The Seven-Agent Architecture

Rather than one agent handling everything, the system uses seven specialised agent types with defined scopes:

  • Sentinel — the front-desk manager. Structures the user’s intent, spawns the Orchestrator, supervises overall completion. Does not write code or make technical decisions.
  • Orchestrator — dispatch-only. Decomposes requirements into milestones, kicks off other agents, synthesises reports. Never writes code or runs builds itself.
  • Explorer — reads requirements and previous logs, writes formal strategies for the Orchestrator to act on. Never writes code.
  • Worker — the coder. Implements strategies, builds the code, runs tests.
  • Reviewer — independently reviews the Worker’s changes for design correctness, edge cases, and interface contract compliance.
  • Critic — stress-tests the solution, runs adversarial tests to find coverage gaps.
  • Auditor — an independent investigator that verifies the authenticity and robustness of generated solutions.

The separation of concerns is deliberate. Keeping analysis, coding, reviewing, and auditing in distinct agents prevents any single role from becoming a single point of failure or a source of unchecked shortcuts.
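
One way to picture that separation is as a table of roles and the actions each is allowed to take. The sketch below is an illustration only: the Action names, the ALLOWED table, and authorize() are hypothetical and are not Antigravity's API. The post states only what the Sentinel, Orchestrator, and Explorer must not do and what the Worker does; the exact capabilities given to the Reviewer, Critic, and Auditor here are assumptions.

```python
from enum import Enum, auto


class Action(Enum):
    STRUCTURE_INTENT = auto()
    SPAWN_AGENTS = auto()
    WRITE_STRATEGY = auto()
    WRITE_CODE = auto()
    RUN_TESTS = auto()
    REVIEW_CHANGES = auto()
    ADVERSARIAL_TEST = auto()
    AUDIT = auto()


# Hypothetical capability table derived from the role descriptions above.
ALLOWED = {
    "sentinel": {Action.STRUCTURE_INTENT, Action.SPAWN_AGENTS},
    "orchestrator": {Action.SPAWN_AGENTS},        # dispatch-only
    "explorer": {Action.WRITE_STRATEGY},          # never writes code
    "worker": {Action.WRITE_CODE, Action.RUN_TESTS},
    "reviewer": {Action.REVIEW_CHANGES},          # assumption: read-only
    "critic": {Action.ADVERSARIAL_TEST},          # assumption
    "auditor": {Action.AUDIT},                    # assumption
}


def authorize(role: str, action: Action) -> None:
    """Raise PermissionError if the role steps outside its scope."""
    if action not in ALLOWED[role]:
        raise PermissionError(f"{role} may not {action.name}")


authorize("worker", Action.WRITE_CODE)            # allowed, returns None
try:
    authorize("orchestrator", Action.WRITE_CODE)  # outside its scope
except PermissionError as err:
    print(err)
```

In this layout only one role can ever produce code, so every change has to pass through agents that had no hand in writing it.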


Three Technical Tricks That Made It Work

Running 93 parallel agents over a task of this complexity surfaces problems that simpler setups never hit. Three specific mechanisms kept things on track.

Self-succession for context length. Large, long-running tasks fill up context windows. The Orchestrator tracks its cumulative subagent spawn count. Once it hits a limit, it dumps its full state to handoff files, terminates its background tasks, and spawns a successor with the same goals and permissions. The successor picks up from the files; the original terminates. Context resets cleanly without losing progress.
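
The handoff pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the team's implementation: the Orchestrator class, the SPAWN_LIMIT value, the JSON handoff file, and the generation counter are all hypothetical, and spawning a subagent is reduced to appending to a list.

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path

SPAWN_LIMIT = 3  # hypothetical; the post does not state the real limit


@dataclass
class Orchestrator:
    goals: list
    completed: list = field(default_factory=list)
    spawn_count: int = 0   # cumulative spawns in this instance
    generation: int = 0    # how many successions have happened

    def spawn_subagent(self, milestone: str) -> None:
        # Stand-in for real subagent dispatch.
        self.spawn_count += 1
        self.completed.append(milestone)

    def needs_successor(self) -> bool:
        return self.spawn_count >= SPAWN_LIMIT

    def hand_off(self, path: Path) -> None:
        # Dump full state so a fresh instance can continue from disk.
        path.write_text(json.dumps(asdict(self)))

    @classmethod
    def resume(cls, path: Path) -> "Orchestrator":
        state = json.loads(path.read_text())
        # Same goals and progress, but a fresh spawn counter (fresh context).
        return cls(goals=state["goals"], completed=state["completed"],
                   spawn_count=0, generation=state["generation"] + 1)


def run(milestones, handoff: Path) -> Orchestrator:
    orch = Orchestrator(goals=["build the OS"])
    for milestone in milestones:
        if orch.needs_successor():
            orch.hand_off(handoff)
            orch = Orchestrator.resume(handoff)  # the original would exit here
        orch.spawn_subagent(milestone)
    return orch
```

The key design point is that progress lives in the handoff file rather than in the model's context, so the successor starts with an empty context and loses nothing.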

Scheduled crons for stuck processes. With many parallel subagents, any one of them can enter an infinite loop, hang on a compile, or stall on blocked I/O. A background cron — using Antigravity’s Scheduled Tasks primitive — monitors progress files that subagents write to. If a file’s timestamp goes stale past a threshold, the Sentinel terminates and respawns the blocked agent automatically.
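
The staleness check itself is simple to sketch. The example below assumes each subagent touches a file named <agent-id>.progress in a shared directory and that a scheduler calls watchdog_tick() periodically. The file naming, the 300-second threshold, and the respawn callback are all hypothetical, and Antigravity's Scheduled Tasks primitive is not modelled here.

```python
import time
from pathlib import Path

STALE_AFTER_S = 300  # hypothetical threshold; the post gives no figure


def find_stale_agents(progress_dir: Path, stale_after_s: float = STALE_AFTER_S,
                      now=None) -> list:
    """Return IDs of agents whose progress file has not changed recently."""
    now = time.time() if now is None else now
    stale = []
    for progress_file in sorted(progress_dir.glob("*.progress")):
        age = now - progress_file.stat().st_mtime  # seconds since last write
        if age > stale_after_s:
            stale.append(progress_file.stem)
    return stale


def watchdog_tick(progress_dir: Path, respawn, **kwargs) -> list:
    """One scheduled run: respawn every agent that looks stuck."""
    stale = find_stale_agents(progress_dir, **kwargs)
    for agent_id in stale:
        respawn(agent_id)  # caller decides how to terminate and restart
    return stale
```

Checking file timestamps keeps the watchdog independent of the agents it monitors: a subagent stuck in a loop or blocked on I/O cannot report its own failure, but it also cannot keep its progress file fresh.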

An Auditor to catch LLM laziness. When a task is difficult enough, a model may take shortcuts — hardcoding a test output, writing a mock facade that makes tests pass without implementing the underlying logic. The Auditor runs strict static analysis checks, independent of whether the code works. Before the Sentinel marks any task complete and notifies the user, a final audit is forced. If the Auditor finds cheating, the cycle continues.
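
The post does not say which static checks the Auditor runs. As a toy illustration of the idea, the sketch below uses Python's ast module to flag functions whose entire body is a single hardcoded return value, one crude signal of a faked implementation. The function name and the heuristic are assumptions, and the check inspects Python source only; it never executes the code or looks at test results.

```python
import ast


def find_constant_returns(source: str) -> list:
    """Names of functions whose whole body is `return <literal>`."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        body = list(node.body)
        # Ignore a leading docstring so it cannot hide the pattern.
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            body = body[1:]
        if (len(body) == 1
                and isinstance(body[0], ast.Return)
                and isinstance(body[0].value, ast.Constant)
                and body[0].value.value is not None):
            flagged.append(node.name)
    return flagged


SAMPLE = '''
def checksum(data):
    """Pretends to hash the input."""
    return 3735928559

def add(a, b):
    return a + b
'''

print(find_constant_returns(SAMPLE))
```

A heuristic like this produces false positives (legitimate constant getters, for example), so in practice it would be one input to an audit rather than a verdict on its own.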


/teamwork-preview — Now Available in Antigravity

The same orchestration used in these experiments is now accessible through a new slash command: /teamwork-preview. It’s a research preview, available to Antigravity users on the Google AI Ultra plan ($200/month). It uses the same core primitives — parallel subagents, asynchronous tasks, hooks, and scheduled tasks — with no special internal version of the product.

A few practical notes from the announcement:

  • Recommended model: Gemini 3.5 Flash. Using a larger model will substantially increase costs.
  • Quota: Even with Gemini 3.5 Flash on AI Ultra, a single complex task can exhaust a full weekly quota. Users can purchase additional AI credits.
  • Resuming mid-task: If the agent team stops due to a quota or credit issue, users can purchase more credits and send “Continue” — the team picks up from where it stopped.
  • Local machine required: Since the agents run locally, the machine must stay awake for the duration of the run, even if the user isn’t actively monitoring it.

The post describes the current state as a research preview, with ongoing iteration on orchestration, UI, performance, reliability, and observability.


FAQ / Common Questions

What is Google Antigravity 2.0?
Antigravity is Google’s AI agent platform. Version 2.0 introduces new primitives including parallel-running subagents, asynchronous tasks, hooks, and scheduled tasks. The OS and AlphaZero experiments were built using these same primitives, with no special internal tooling.

Which Gemini model was used, and why not a larger one?
Gemini 3.5 Flash was used. Gemini 3.1 Pro was also tried but could not complete the OS build. The post notes that even Flash, the lighter and more economical model, succeeded, which the team sees as evidence of a significant jump in underlying model intelligence rather than orchestration alone.

What are the limitations of the OS that was built?
The OS lacks floating-point math support, hardware acceleration, complex multi-threading, sandboxing, JIT compilation, and complex audio/video decoding. It is a functional barebones OS, not comparable to a modern production operating system.

Who can access /teamwork-preview?
It’s available to Antigravity users on the Google AI Ultra plan ($200/month) as a research preview. The post recommends pairing it with Gemini 3.5 Flash and warns that a single complex task can exhaust a full weekly quota.


Note: Details above are based on Google’s announcement published at antigravity.google/blog, and are subject to change. Final feature availability, rollout timing, and supported plans may vary. Verify against Google’s official channels before relying on any specific detail.

Disclaimer: This post summarises a Google product announcement for informational purposes. It is not affiliated with or endorsed by Google or any platform mentioned.

Google’s Gemini Intelligence on Android — Multi-Step App Automation, Rambler, Create My Widget Coming First to Galaxy S26 and Pixel 10

Google has announced Gemini Intelligence on Android, a layer that brings proactive Gemini-powered features to a curated set of Android devices. The rollout starts this summer (2026) on the Samsung Galaxy S26 and Google Pixel 10, with the feature set expanding to Wear OS watches, cars, Android XR glasses, and Android-powered laptops later in the year.

Key additions include multi-step task automation across apps, the new Rambler voice-to-text feature with built-in Hindi-English code-mixing support, Create My Widget for natural-language widget building, smarter Autofill tied to Gemini’s Personal Intelligence, and Gemini in Chrome for research and browsing tasks (rolling out late June).

The framing from Google: Android is moving from an operating system into an intelligence system. Privacy and control are part of the pitch — Gemini acts only on explicit commands, audio for Rambler isn’t stored, and the Autofill–Gemini link is opt-in.

Multi-Step Task Automation Across Apps

Google has spent the last few months tuning Gemini’s multi-step automation on the Galaxy S26 and Pixel 10, with food-delivery and rideshare apps as the launch focus. You hand off the logistics (booking a spin-class bike, finding a Gmail-attached class syllabus and ordering the books, walking through a delivery app’s checkout) and Gemini drives the in-app steps for you.

Screen and image context unlock more of this. Long-press the power button over a notes-app grocery list and Gemini can turn it into a shopping cart for delivery. Snap a photo of a travel brochure in a hotel lobby and ask it to find a comparable tour on Expedia for six people. Notifications track each task as Gemini works in the background.

The control model is the part to pay attention to. Gemini only acts on an explicit command, runs until the task is done, and stops. A final confirmation step stays with you.

Gemini in Chrome — Late June Rollout

Starting in late June, Chrome on Android gets a Gemini browsing layer. The assistant can summarize a page, compare information across multiple tabs, and answer research-style queries from inside the browser.

There’s also Chrome auto browse — Gemini taking care of repetitive web tasks on its own. Two examples called out by Google: booking an appointment and reserving a parking spot. Same control model as app automation — explicit commands, defined stop points.

Smarter Autofill, Now Tied to Gemini

Autofill with Google has handled the obvious fields for a while. The Gemini version goes after the messier ones — complex forms with multiple sections, vague labels, and the kind of context the older Autofill couldn’t reason about. Gemini pulls the relevant data from your connected apps and fills fields without making you flip between screens.

The Gemini connection here is strictly opt-in. You choose whether to link Gemini to Autofill with Google, and a toggle in settings lets you turn it off at any point.

Rambler — Speech to Polished Text, With Hindi-English Mixing

Gboard’s voice-to-text has been fine for clean dictation. Rambler is built for the way most people actually talk — with um, ah, like, mid-sentence corrections, and the habit of changing direction halfway through. You speak naturally; Rambler keeps the substance and drops the filler, returning a tighter written version.

The India-relevant detail: Rambler handles multi-lingual input in a single message. The example Google flagged is English-Hindi code-mixing — the kind of switching most Indian users do every day in WhatsApp and email. Gemini’s multi-lingual model reads context across the switch and produces a clean message that keeps the original mix.

On privacy: Rambler shows a clear indicator while it’s active, audio is transcribed in real time, and nothing is stored or saved after the transcription is done.

Create My Widget — Generative UI on the Home Screen

Create My Widget is the first generative-UI step on Android. Describe what you want in plain language, and Gemini builds the widget.

Two examples from Google’s post:

  • A meal-prep widget told to “suggest three high-protein recipes every week” — Gemini builds a resizable dashboard you can drop on the home screen.
  • A cyclist asking for a weather widget that surfaces only wind speed and rain — Gemini strips the standard weather card down to just those data points.

The widgets work on Gemini Intelligence Android phones and on Wear OS watches. A watch widget can be a stripped-down view of the same generated layout, so the data you care about stays one glance away.

A UI Built on Material 3 Expressive

Gemini Intelligence ships with an updated visual system layered on Material 3 Expressive. Animations are tied to purpose — confirming a task, showing progress, transitioning between states — and the design aims to reduce ambient distractions rather than add to them.

Rollout Timing — What’s Live When

Wave | When | What
Phones | Summer 2026 | Samsung Galaxy S26, Google Pixel 10
Chrome on Android | Late June 2026 | Gemini in Chrome, Chrome auto browse
Watches, cars, Android XR glasses, laptops | Later in 2026 | Gemini Intelligence features expand across categories

FAQ / Common Questions

When does Gemini Intelligence start rolling out on Android?
This summer (2026), starting on the latest Samsung Galaxy and Google Pixel phones. Other Android device categories — watches, cars, glasses, and laptops — get the features later in 2026.

Which phones get Gemini Intelligence first?
Google has named the Samsung Galaxy S26 and the Google Pixel 10 as the launch devices.

Does Rambler support Hindi?
Yes — Rambler handles multi-lingual input in a single message, with English-Hindi code-mixing specifically called out. Other language combinations work too, as long as Gemini’s multi-lingual model supports them.

Will Gemini Intelligence work on Wear OS watches?
Yes, but later. Watches are part of the next wave, alongside cars, Android XR glasses, and laptops, expected later in 2026. Create My Widget is one of the features confirmed for the watch.

Is the Autofill–Gemini connection automatic?
No — it’s strictly opt-in. You choose whether to link the two, and a toggle in settings lets you turn it off whenever you want.

Does Rambler store my voice recordings?
No. Audio is transcribed in real time and is not stored or saved.


Note: Details above are based on Google’s announcement on May 12, 2026, and are subject to change. Final feature availability, rollout timing, and supported devices may vary by region. Verify against Google’s official channels before relying on any specific detail.

Disclaimer: This post summarizes a Google product announcement for informational purposes. It is not affiliated with or endorsed by Google or any device manufacturer mentioned.

Source: Google blog.
