Google open-sources Gemini Ultra 2 family — 405B, 70B, 8B

Google AI

June 6, 2026

3 MIN

Original source

blog.google — read the full announcement →

Google drops the entire Gemini Ultra 2 family into the open

On May 12, Google AI released the full Gemini Ultra 2 model suite under an Apache 2.0 license. It comes in three sizes: 8 billion parameters, 70 billion, and a 405-billion-parameter behemoth. Google is releasing base weights, instruction-tuned variants, and, here's the kicker, the complete training recipe, including data mixes and hyperparameter schedules. The 405B model reportedly matches GPT-5 on MMLU-Pro and beats it on MATH-500 by 4 points. On coding benchmarks, it's roughly on par with Claude 4 Opus. At the other end of the range, the 8B model, which can run on a single RTX 5090, outperforms Llama 3.1 8B by 15 percent on GSM8K. That's not a rounding error.

Why this moment exists: the post-closed-model landscape

For the past two years, Google has been playing catch-up in the open-weight race. Meta's Llama 3 and 4 series dominated the open ecosystem. Mistral's models carved out a niche for efficiency. Google's earlier Gemma series was solid but never set the benchmark. Meanwhile, the closed models — GPT-5, Claude 4, Gemini 1.5 Pro — kept pulling ahead. Google's own Gemini Ultra 1.0 was locked behind API paywalls. The strategic calculus changed when DeepSeek's R1 and V3 demonstrated that open models could match frontier performance. Google's leadership decided that open-sourcing their best work was the only way to maintain relevance in the developer community. The result is Gemini Ultra 2: a family that didn't just match the competition — it set a new bar for openness.

The real story is the training transparency — and what it means for competition

Honestly, the most interesting part isn't the model itself; it's that Google published the full training recipe. For the first time, we have a detailed view of how a frontier-scale model was built: the data mixing ratios, the learning rate schedule, the optimizer settings, and even the failure cases. This is unprecedented for a model of this size. The implications are twofold. First, it lowers the barrier for academic research: labs that couldn't afford to replicate a 405B training run can now study the recipe and adapt it for smaller scales. Second, it puts pressure on other big players. If Google can open-source a 405B model, why can't OpenAI? Meta might have to respond with an even more open release of Llama 5. For startups building on top of these models, the cost advantage is massive. If you're running a 50-person engineering team and paying $40k/month for API access, a 40 percent cost reduction from self-hosting a 70B model saves $16k a month, or $192k a year.

What we still don't know about Gemini Ultra 2's safety and data provenance

Google's announcement was light on details about the training data itself. The company says the data is filtered and deduplicated, but it hasn't released the dataset. That matters for reproducibility and for legal risk. Several copyright lawsuits against AI companies are still winding through the courts. If Google used copyrighted data without explicit permission, the Apache 2.0 license doesn't shield downstream users. Also missing: a full safety evaluation. Google published standard red-teaming results, but independent researchers have already found that the 8B model can be jailbroken with simple prompt tricks. The 405B model's safety guardrails are reportedly stronger, but no one outside Google has verified that. And there's the question of inference cost. Running a 405B model requires at least 8 H100s or 4 B200s, and that's not cheap. Even those counts assume 8-bit weights: at 16 bits per parameter, the weights alone take about 810 GB, more than either configuration holds. For many developers, the 70B model will be the sweet spot, but even that needs a multi-GPU setup. The open-source community will need to figure out quantization and pruning before these models become truly accessible.
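The hardware figures above follow from simple arithmetic on parameter count and precision. Here is a rough sketch in Python. The function names are made up for illustration, the per-GPU capacities are the commonly cited 80 GB (H100) and 192 GB (B200) and should be checked against the exact SKU, and the estimate covers weights only, so treat the GPU counts as a lower bound rather than a deployment plan.

```python
import math

# Assumed per-GPU memory in GB (commonly cited capacities, not from the article).
GPU_MEMORY_GB = {"H100": 80, "B200": 192}


def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def min_gpus(params_billions: float, bits_per_weight: int, gpu: str) -> int:
    """Smallest GPU count whose combined memory holds the weights.

    Ignores KV cache, activations, and framework overhead, so a real
    deployment needs headroom beyond this number.
    """
    needed = weight_memory_gb(params_billions, bits_per_weight)
    return math.ceil(needed / GPU_MEMORY_GB[gpu])


# Print a small table for the three model sizes at three precisions.
for size in (8, 70, 405):
    for bits in (16, 8, 4):
        gb = weight_memory_gb(size, bits)
        print(
            f"{size}B @ {bits}-bit: {gb:.1f} GB of weights, "
            f"H100 x{min_gpus(size, bits, 'H100')}, "
            f"B200 x{min_gpus(size, bits, 'B200')}"
        )
```

Under these assumptions, the 70B model at 16-bit precision already exceeds a single 80 GB card, which is why it needs a multi-GPU setup unless it is quantized.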

Frequently Asked Questions

How do the Gemini Ultra 2 models compare to GPT-5 and Claude 4?

On standard benchmarks, the 405B model matches GPT-5 on MMLU-Pro and surpasses it on MATH-500 by 4 points. It's roughly on par with Claude 4 Opus on coding tasks. The 70B model is competitive with GPT-4o in most areas, while the 8B model leads its size class, beating Llama 3.1 8B by 15 percent on GSM8K.

Can I run the 8B model on my laptop?

Yes. At 16-bit precision the 8B model's weights take about 16 GB, so it fits comfortably on a single consumer GPU like an RTX 5090 with 32GB VRAM. With 4-bit quantization the weights shrink to roughly 4 GB, and it can even run on a MacBook Pro with 64GB of unified memory, albeit at reduced speed. That's a big deal for local AI applications.
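For readers unfamiliar with the term, 4-bit quantization stores each weight as a small integer plus a shared scale factor. Below is a minimal sketch of symmetric absmax quantization, one common scheme. It does not describe how any Gemini Ultra 2 tooling quantizes weights: the function names and sample values are illustrative, and production quantizers typically use a separate scale per small block of weights rather than one scale for everything.

```python
def quantize_4bit(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-7, 7] using a single absmax scale."""
    absmax = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale works.
    scale = absmax / 7 if absmax > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Illustrative weights, not taken from any real model.
weights = [0.12, -0.50, 0.33, 0.07, -0.21, 0.70]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code keeps each error within half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("codes:", codes)
print("scale:", scale)
print("max reconstruction error:", max_error)
```

Each code fits in 4 bits instead of 16, which is where the roughly fourfold memory saving comes from. The price is the reconstruction error printed at the end.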

Is the training data for Gemini Ultra 2 fully open?

No. Google released the training recipe — the data mixing ratios, hyperparameters, and optimization details — but not the actual dataset. They cite privacy and copyright concerns. For full reproducibility, researchers would need to reconstruct a similar dataset, which is nontrivial.

What license are the models released under?

Apache 2.0. That's one of the most permissive open-source licenses. You can use, modify, and distribute the models for any purpose, including commercial applications, without paying royalties. No restrictions on usage beyond standard Apache terms.

When will fine-tuning and deployment tools be available?

Google released basic inference code on GitHub alongside the model weights. Third-party frameworks like Hugging Face Transformers and vLLM are expected to add support within a week. Fine-tuning scripts using LoRA and QLoRA are promised in the next update, but no date is set.
