Gemini 3.5 Flash: Google's New Flash Model Outperforms Last Year's Pro
Summary: Google's Gemini 3.5 Flash, unveiled at Google I/O on May 19, beats Gemini 3.1 Pro, last year's Pro model, on several agentic and coding benchmarks while costing roughly 40% less and generating output about four times faster than the prior generation.
Key Facts
- Benchmark reversals: Terminal-Bench 2.1 at 76.2% (vs. Gemini 3.1 Pro's 70.3%), MCP Atlas 83.6% (vs. 78.2%), Finance Agent v2 57.9% (vs. 43.0%)
- Speed: more than 280 output tokens/second, according to independent benchmarker Artificial Analysis; about 4× faster than the prior generation
- Coding: SWE-Bench score of 81.0%, edging out Claude Opus 4.6 (80.8%)
- Pricing: $1.50 / $9.00 per million input/output tokens — ~40% cheaper than Gemini 3.1 Pro, though 3× pricier than Gemini 3 Flash
- Gemini 3.5 Pro was not released at I/O; Sundar Pichai said "give us until next month"
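To make the pricing and speed figures concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the list prices and throughput quoted above. The workload (20,000 input tokens and 1,400 output tokens per agent step) and the function names are hypothetical, chosen purely for illustration. Real bills may differ with caching, batching, or tiered pricing, none of which is modeled here.

```python
# Prices and throughput are the figures quoted in Key Facts.
# The workload numbers below are hypothetical, not measurements.
INPUT_PRICE_PER_MTOK = 1.50   # USD per million input tokens
OUTPUT_PRICE_PER_MTOK = 9.00  # USD per million output tokens
OUTPUT_TOKENS_PER_SEC = 280   # reported as ">280", so treat as a floor


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted list prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


def generation_seconds(output_tokens: int) -> float:
    """Return the approximate time to generate the output at the quoted throughput."""
    return output_tokens / OUTPUT_TOKENS_PER_SEC


if __name__ == "__main__":
    # Hypothetical agent step: 20,000 tokens of context in, 1,400 tokens out.
    cost = request_cost(20_000, 1_400)
    secs = generation_seconds(1_400)
    print(f"cost per step: ${cost:.4f}")      # cost per step: $0.0426
    print(f"generation time: {secs:.1f} s")   # generation time: 5.0 s
```

At these assumed sizes, a 1,000-step agent run would cost about $42.60 in tokens. Input tokens account for most of that ($30.00), even though output tokens are priced six times higher per token.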
Why It Matters
When a Flash-tier model eclipses the prior generation's flagship on agentic and coding tasks, the cost curve for high-quality AI inference shifts meaningfully downward. Builders running production agentic pipelines can now match or exceed Gemini 3.1 Pro on those benchmarks for roughly 40% less per token, which shortens the path from prototype to viable product. The savings are relative to the Pro tier, though: at about 3× the price of Gemini 3 Flash, the new model is not the cheapest option in the lineup.
Read More
- Google Introduces Gemini 3.5 Flash at I/O 2026 — MarkTechPost
- Gemini 3.5 Flash benchmark deep-dive — Appwrite