NVIDIA Kicks Off Blackwell Ultra B300 Mass Production: 50x Hopper Throughput per Megawatt
Summary: NVIDIA started mass production of its Blackwell Ultra B300 GPU on June 12. The B300 is the company's next data-center generation, designed explicitly for agentic AI inference and long-context reasoning workloads.
Key Facts
- 288 GB of HBM3e on the B300, up from 192 GB on the B200 (a 1.5x increase), giving headroom for larger models and longer context windows
- DGX B300 delivers 192 petaFLOPS for inference and 70 petaFLOPS for training
- Per NVIDIA, 50x higher throughput per megawatt and 35x lower cost per token than the Hopper generation on low-latency agentic workloads
- Production start triggers the H2 2026 data-center build cycle; first deployments expected in enterprise AI clusters before year-end
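To make the memory figure concrete, here is a minimal Python sketch of the capacity ratio between the two GPUs and of how much memory remains for KV cache once model weights are loaded. The 140 GB weight footprint is a hypothetical value chosen for illustration (roughly a 70B-parameter model at 16-bit precision), not a figure from NVIDIA, and the helper name `kv_headroom_gb` is likewise made up for this example.

```python
# HBM capacities taken from the Key Facts above, in GB.
B200_HBM_GB = 192
B300_HBM_GB = 288


def kv_headroom_gb(hbm_gb: float, weights_gb: float) -> float:
    """Memory left for KV cache and activations after loading the weights."""
    if weights_gb > hbm_gb:
        raise ValueError("weights do not fit on a single GPU")
    return hbm_gb - weights_gb


# Raw capacity increase from B200 to B300.
capacity_ratio = B300_HBM_GB / B200_HBM_GB

# Hypothetical model: 70e9 parameters * 2 bytes each = 140 GB of weights.
weights_gb = 140.0

b200_free = kv_headroom_gb(B200_HBM_GB, weights_gb)
b300_free = kv_headroom_gb(B300_HBM_GB, weights_gb)
headroom_ratio = b300_free / b200_free

print(f"capacity ratio: {capacity_ratio:.2f}x")
print(f"free memory on B200: {b200_free:.0f} GB, on B300: {b300_free:.0f} GB")
print(f"headroom ratio: {headroom_ratio:.2f}x")
```

Under that assumption, the 1.5x capacity increase leaves roughly 2.8x more room for context, because the weights are a fixed cost on both GPUs. The sketch ignores activation memory, fragmentation, and framework overhead, so real headroom would be smaller.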
Why It Matters
As frontier AI shifts from chatbots toward multi-step reasoning agents, inference efficiency, rather than peak benchmark scores alone, determines which providers can serve at scale profitably. The B300's economics change what it costs to run an AI agent continuously, which directly affects the price floor for AI services and the competitive position of hyperscalers racing to build out capacity.
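The effect on continuous-agent costs can be sketched with simple arithmetic. The Python example below applies the 35x cost-per-token reduction quoted above to a baseline; the token rate and the Hopper-era price per million tokens are hypothetical placeholders, not published figures, and `daily_agent_cost` is an illustrative helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Cost-per-token reduction vs. Hopper, as quoted in the Key Facts.
COST_REDUCTION = 35


def daily_agent_cost(tokens_per_second: float, usd_per_million_tokens: float) -> float:
    """Cost in USD of one agent generating tokens nonstop for 24 hours."""
    tokens_per_day = tokens_per_second * SECONDS_PER_DAY
    return tokens_per_day / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs, chosen only to illustrate the scaling.
agent_tokens_per_second = 50.0
hopper_usd_per_million = 7.0

hopper_daily = daily_agent_cost(agent_tokens_per_second, hopper_usd_per_million)
b300_daily = daily_agent_cost(
    agent_tokens_per_second, hopper_usd_per_million / COST_REDUCTION
)

print(f"Hopper baseline: ${hopper_daily:.2f} per agent per day")
print(f"B300 at 35x lower cost per token: ${b300_daily:.2f} per agent per day")
```

With these placeholder inputs, an always-on agent drops from about $30 a day to under $1. The ratio is what carries over to other inputs; the absolute dollar amounts depend entirely on the assumed baseline.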
Read More
- NVIDIA Blackwell Ultra AI Factory Platform — NVIDIA Newsroom
- Nvidia Blackwell B300 Mass Production Begins — TechnoSports