Google drops the entire Gemini Ultra 2 family into the open
On May 12, Google AI released the full Gemini Ultra 2 model suite under an Apache 2.0 license. It comes in three sizes: 8 billion parameters, 70 billion, and a 405-billion-parameter behemoth. Google is releasing base weights, instruction-tuned variants, and, here's the kicker, the complete training recipe, including data mixes and hyperparameter schedules. The 405B model reportedly matches GPT-5 on MMLU-Pro and beats it on MATH-500 by 4 points. On coding benchmarks, it's roughly on par with Claude 4 Opus. At the other end of the range, the 8B model, which can run on a single RTX 5090, outperforms Llama 3.1 8B by 15 percent on GSM8K. That's not a rounding error.
Why this moment exists: the post-closed-model landscape
For the past two years, Google has been playing catch-up in the open-weight race. Meta's Llama 3 and 4 series dominated the open ecosystem. Mistral's models carved out a niche for efficiency. Google's earlier Gemma series was solid but never set the benchmark. Meanwhile, the closed models (GPT-5, Claude 4, Gemini 1.5 Pro) kept pulling ahead, and Google's own Gemini Ultra 1.0 was locked behind API paywalls. The strategic calculus changed when DeepSeek's R1 and V3 demonstrated that open models could match frontier performance. Google's leadership decided that open-sourcing its best work was the only way to stay relevant in the developer community. The result is Gemini Ultra 2: a family that doesn't just match the competition but sets a new bar for openness.
The real story is the training transparency — and what it means for competition
The most interesting part isn't the model itself; it's that Google published the full training recipe. For the first time, we have a detailed view of how a frontier-scale model was built: the data mixture, the learning rate schedule, the optimizer settings, and even the failure cases. That is unprecedented for a model of this size, and the implications are twofold. First, it lowers the barrier for academic research: labs that couldn't afford to replicate a 405B training run can now study the recipe and adapt it for smaller scales. Second, it puts pressure on other big players. If Google can open-source a 405B model, why can't OpenAI? Meta might have to respond with an even more open release of Llama 5. For startups building on top of these models, the cost advantage is massive. If you're running a 50-person engineering team and paying $40k/month for API access, a 40 percent cost reduction from self-hosting a 70B model would save $16,000 a month, or $192,000 a year.
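The self-hosting arithmetic is simple, but it helps to make it explicit. Below is a minimal Python sketch that assumes the hypothetical $40k monthly API bill and 40 percent reduction from the example above. The function name `self_hosting_savings` is illustrative, and the calculation deliberately ignores hardware, operations, and staffing costs that a real comparison would have to include.

```python
def self_hosting_savings(monthly_api_spend: float, reduction: float) -> dict:
    """Return monthly and annual savings for a fractional cost reduction.

    `reduction` is a fraction (0.40 means 40 percent). This is a sketch:
    it treats the reduction as a given and models no infrastructure costs.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be a fraction between 0 and 1")
    monthly_savings = monthly_api_spend * reduction
    return {
        "monthly_savings": monthly_savings,
        "annual_savings": monthly_savings * 12,
        "new_monthly_cost": monthly_api_spend - monthly_savings,
    }


# Figures from the article's example: $40k/month, 40 percent reduction.
result = self_hosting_savings(40_000, 0.40)
for name, value in result.items():
    print(f"{name}: ${value:,.0f}")
```

Swapping in your own spend and a more conservative reduction is a quick way to see how sensitive the case for self-hosting is to that 40 percent assumption.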
What we still don't know about Gemini Ultra 2's safety and data provenance
Google's announcement was light on details about where the training data came from. The published recipe describes data mixes, and Google says the corpus is filtered and deduplicated, but the exact dataset composition has not been released. That matters for reproducibility and for legal risk. Several copyright lawsuits against AI companies are still working their way through the courts. If Google used copyrighted data without explicit permission, the Apache 2.0 license doesn't shield downstream users. Also missing is a full safety evaluation. Google published standard red-teaming results, but independent researchers have already found that the 8B model can be jailbroken with simple prompt tricks. The 405B model's safety guardrails are reportedly stronger, but no one outside Google has verified that. Then there's the question of inference cost. Running a 405B model requires at least 8 H100s or 4 B200s, and that's not cheap. For many developers, the 70B model will be the sweet spot, but even that needs a multi-GPU setup. The open-source community will need to figure out quantization and pruning before these models become truly accessible.
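To put rough numbers on the hardware question, here is a back-of-the-envelope Python sketch that estimates the memory needed for model weights alone at different precisions. The per-GPU memory figures (80 GB for an H100, 192 GB for a B200, 32 GB for an RTX 5090) are nominal capacities assumed for illustration, and the helper names are hypothetical. The estimate ignores KV cache, activations, and framework overhead, so real deployments need more headroom than these counts suggest.

```python
import math

# Bytes per parameter at common weight precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}

# Assumed nominal memory per GPU in GB; usable memory is somewhat lower.
GPU_MEMORY_GB = {"H100": 80, "B200": 192, "RTX 5090": 32}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


def min_gpus(params_billions: float, precision: str, gpu: str) -> int:
    """Smallest GPU count whose combined memory holds the weights.

    This is a lower bound: it leaves no room for KV cache or activations.
    """
    needed = weight_memory_gb(params_billions, precision)
    return math.ceil(needed / GPU_MEMORY_GB[gpu])


# The three model sizes named in the article.
for size in (8, 70, 405):
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(size, precision)
        counts = ", ".join(
            f"{min_gpus(size, precision, gpu)}x {gpu}" for gpu in GPU_MEMORY_GB
        )
        print(f"{size}B @ {precision}: {gb:.1f} GB weights -> {counts}")
```

Under these assumptions, the 405B weights alone come to 810 GB in bf16, which is more than eight 80 GB cards can hold, so an 8-GPU H100 deployment is only plausible with 8-bit or lower-precision weights. The 8B model at bf16 needs about 16 GB, which is consistent with the single RTX 5090 claim, and 4-bit quantization is what would bring the 70B model within reach of a single high-memory card.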

