
Gemini 3.5 Flash & Omni: Google's I/O 2026 AI Play

Google launched Gemini 3.5 Flash for agentic workflows and Gemini Omni for video generation at I/O 2026. Here's what each model does.

The AI Dude · May 20, 2026 · 8 min read

Two Models, Two Very Different Bets

Google DeepMind just shipped two new Gemini variants at I/O 2026: Gemini 3.5 Flash, built for complex agentic workflows, and Gemini Omni, a multimodal model that generates video from essentially any input. Both are available immediately in the Gemini app, Google Search, and Google's Antigravity agent platform (per Google's official model announcements at blog.google).

This is Google doing what Google does best — shipping infrastructure-grade AI across its entire product surface simultaneously. But the two models tell different strategic stories, and they're worth pulling apart.

Gemini 3.5 Flash: The Agent Backbone

Gemini 3.5 Flash succeeds the 2.0 and 2.5 Flash models, the line that became the workhorse behind most of Google's consumer AI features. The jump to "3.5" signals that this isn't a minor refresh: Google is positioning Flash as the model you build agentic systems on top of.

The key word in every piece of Google's messaging is agentic. Gemini 3.5 Flash is designed for multi-step, tool-using workflows where the model needs to plan, execute, observe results, and iterate: a model that can browse the web, read documents, call APIs, and chain those actions together without a human in the loop at every step.
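For readers who haven't built one, the plan, execute, observe, iterate loop can be sketched in a few lines of Python. This is a toy illustration, not Google's API or anything Antigravity exposes: `run_agent`, `toy_planner`, and the two tools are hypothetical names invented here, and the "planner" is a hard-coded stand-in for where a model call would go.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    tool: str
    argument: str
    observation: str


# A planner looks at the goal and the history so far and returns the next
# (tool, argument) pair, or None when it considers the task finished.
Planner = Callable[[str, list], Optional[tuple]]


def run_agent(goal: str, plan_next: Planner, tools: dict, max_steps: int = 5) -> list:
    """Generic plan -> execute -> observe loop with a hard step limit."""
    history = []
    for _ in range(max_steps):
        action = plan_next(goal, history)          # plan
        if action is None:                         # planner says we're done
            break
        tool_name, argument = action
        if tool_name in tools:
            observation = tools[tool_name](argument)   # execute
        else:
            observation = f"error: unknown tool '{tool_name}'"
        history.append(Step(tool_name, argument, observation))  # observe
    return history


def toy_planner(goal: str, history: list) -> Optional[tuple]:
    """Hypothetical stand-in for the model: search, then summarize, then stop."""
    if not history:
        return ("search", goal)
    if history[-1].tool == "search":
        return ("summarize", history[-1].observation)
    return None


# Hypothetical tools; real ones would call a search API, a document store, etc.
TOOLS = {
    "search": lambda query: f"results for '{query}'",
    "summarize": lambda text: f"summary of [{text}]",
}

steps = run_agent("bowline knot", toy_planner, TOOLS)
for step in steps:
    print(step.tool, "->", step.observation)
```

The `max_steps` cap is the simplest possible guardrail: without a human in the loop at every step, something has to stop a planner that never decides it is finished.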

Antigravity Integration

The most interesting deployment detail is the tight integration with Antigravity, Google's agent platform. Antigravity provides the scaffolding — tool definitions, execution environments, safety guardrails — and 3.5 Flash provides the reasoning engine underneath. This is Google's answer to the agent frameworks that have proliferated in the open-source world (LangChain, CrewAI, AutoGen), except it's fully managed and wired directly into Google's services.

My read: the Antigravity angle matters more than the model itself. Everyone has a capable reasoning model now. The differentiation is in the platform layer — can you give the model access to Gmail, Calendar, Drive, Maps, and Search in a way that actually works reliably? Google has a structural advantage here that no other AI lab can match, because they own the services the agents need to interact with.

Where 3.5 Flash Fits in the Lineup

Google's Gemini family now has a lot of models. Here's how they slot together based on what Google has disclosed across its model pages:

Model	Optimized For	Position
Gemini 2.5 Pro	Complex reasoning, long context	Flagship thinker
Gemini 2.5 Flash	Speed + cost efficiency	High-volume inference
Gemini 3.5 Flash	Agentic workflows	Agent backbone
Gemini Omni	Multimodal generation (video)	Creative generation

The naming is getting confusing: 3.5 Flash sits alongside 2.5 Pro rather than replacing it. Google appears to be branching the version tree by capability rather than running a single version number forward. Whether this scheme sticks or gets simplified later is an open question.

Gemini Omni: Video From Any Input

Gemini Omni is the flashier announcement (no pun intended). This is Google's entry into the AI video generation space, and the pitch is broad: generate video from text, images, audio, or combinations of all three.

The "any input" framing is significant. Most video generation models — Runway Gen-3, Sora, Kling — start from text prompts or image-to-video workflows. Omni's multimodal input approach means you could theoretically feed it a slideshow, a voiceover, and a text description and get a coherent video out. At least, that's the promise per Google's announcement. How well it actually works at launch is something we'll need third-party evaluations to confirm.

How Omni Competes

The AI video generation market has gotten crowded fast. Here's the landscape Omni is entering:

Runway Gen-3 / Gen-4: The incumbent for professional creative workflows. Strong on controllability and consistency.
OpenAI Sora: High-quality output but limited availability and steep pricing.
Kling (Kuaishou): Competitive quality, aggressive pricing, popular in Asia-Pacific markets.
Veo 2 (Google): Google's own previous video model, integrated into YouTube Shorts and other products.

Omni's advantage isn't necessarily quality — we don't have independent benchmark comparisons yet. The advantage is distribution. If Omni is available directly in the Gemini app and Google Search, it reaches hundreds of millions of users on day one. Runway and Sora require separate accounts and workflows. Google can make video generation a feature of Search rather than a standalone product, and that changes the competitive dynamics entirely.

The Search Integration Angle

Google mentioned availability in Search alongside the Gemini app. This is worth pausing on. AI-generated video responses in Search would be a significant UX shift — imagine searching "how to tie a bowline knot" and getting a generated video demonstration instead of (or alongside) a YouTube result.

I think this is where Omni's real strategic value lies. Google has been gradually transforming Search from a link directory into an answer engine (AI Overviews, featured snippets, Knowledge Panels). Adding generated video is the next logical step, and it gives Google a reason to keep users on Google.com rather than clicking through to YouTube or TikTok.

What We Don't Know Yet

Both announcements are light on several details that matter to developers and businesses. The open questions:

Pricing: Google hasn't published API pricing for either model at the time of this writing. For 3.5 Flash, the question is whether it's priced at the Flash tier (~$0.10 per million input tokens for 2.0 Flash) or higher given the agentic capabilities. For Omni, video generation pricing is unknown: Sora is billed per second of output, while Runway bills in credits.
Benchmarks: No third-party benchmark results are available yet. Google's own internal evaluations were referenced in the announcement, but independent scores on standard benchmarks (MMLU, HumanEval, SWE-Bench, etc.) haven't surfaced. These typically follow within days of launch as the community gets API access.
Context window: The 2.5 Flash and Pro models offered up to 1M tokens of context. Google hasn't confirmed whether 3.5 Flash matches or exceeds this.
Omni video length and quality: Maximum video duration, resolution, and frame rate haven't been detailed. These are the specs that determine whether Omni is a toy or a tool.
Rate limits and availability: "Available immediately" could mean anything from full API access to a waitlisted preview. The gap between announcement and actual developer access has varied wildly across Google's AI launches.
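To see why the pricing question matters for agentic workloads in particular, here is a rough back-of-the-envelope sketch. Every number in it is an assumption for illustration: the per-million-token rates are placeholders (Google has not published 3.5 Flash pricing), and the workload figures are invented. It also counts input tokens only, ignoring output tokens and any caching discounts.

```python
def input_cost_usd(input_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a batch of input tokens at a given per-million-token rate."""
    return input_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical workload: agents make several model calls per task,
# so token volume scales with steps, not just with user requests.
runs_per_day = 10_000
steps_per_run = 8
input_tokens_per_step = 4_000

daily_tokens = runs_per_day * steps_per_run * input_tokens_per_step

# Placeholder rates in USD per million input tokens, NOT published prices.
for rate in (0.10, 0.30, 1.00):
    daily = input_cost_usd(daily_tokens, rate)
    print(f"${rate:.2f}/M tokens -> ${daily:,.2f} per day")
```

The point is the multiplier: an eight-step agent run consumes roughly eight times the input tokens of a single-shot request, so small differences in the per-token rate compound quickly at volume.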

I'll update this post as pricing, benchmarks, and independent reviews come in. For now, treat the official announcements as the primary source.

The Bigger Picture: Google's Agent Strategy

Step back from the individual models and the pattern is clear. Google is building a full-stack agent platform:

Reasoning layer: Gemini 2.5 Pro for complex thinking, 3.5 Flash for agentic execution
Generation layer: Omni for video, Imagen for images, existing models for text and code
Platform layer: Antigravity for orchestration, with native access to Google's service APIs
Distribution layer: Gemini app (300M+ monthly users per Google's last disclosure), Search, Android, Chrome

This is a meaningfully different approach from OpenAI (which is building a consumer super-app) or Anthropic (which is focused on API-first enterprise). Google is doing what Microsoft did with Office — embedding AI into existing products that already have massive distribution.

The risk is complexity. Google now has Gemini 2.0 Flash, 2.5 Flash, 2.5 Pro, 3.5 Flash, and Omni all active simultaneously. For developers choosing a model for their application, the decision tree is getting unwieldy. OpenAI's simpler lineup (GPT-5, GPT-5-mini, o3) is easier to reason about, even if it's less capable in specific niches.

Who Should Pay Attention

If you're building agents: Watch 3.5 Flash closely, but wait for independent benchmarks before committing. The Antigravity integration could be a genuine advantage if you're already in Google's ecosystem. If you're building on AWS or Azure, the portability story matters, and Google hasn't clarified how much of 3.5 Flash's agent capability depends on Antigravity and how much is available through the raw API.

If you're in video/creative: Omni is worth tracking but probably not worth switching to on day one. The Runway and Sora ecosystems have more mature tooling. Omni's advantage is distribution, not (yet proven) quality. If you're a casual creator who already uses the Gemini app, it's a free upgrade. If you're a production team, wait for the specs.

If you're watching the market: The 3.5 Flash launch is the more strategically important of the two. Agentic AI is where the revenue will be in 2026-2027, and Google just declared that it's building the platform layer, not just the model layer. That's a direct challenge to every agent framework startup in the ecosystem.

The Honest Take

Google I/O announcements are always heavy on ambition and light on benchmarks. That's not a criticism — it's the nature of a developer conference keynote. The Gemini 3.5 Flash and Omni launches are genuinely significant because they signal where Google is investing, but the gap between "announced at I/O" and "reliably available in production" has historically been months, not days.

The strongest signal here isn't either model individually. It's that Google is building vertically — models, platform, distribution — in a way that no other AI company can replicate. Whether they execute on that advantage is the question that actually matters. History says Google is better at launching than at sustaining product focus, but the AI race might be the thing that finally forces them to follow through.
