
Google Gemma 4 Launches With Four Sizes, Apache 2.0 License, and a Top-3 Open Model Ranking
Google's Gemma 4 arrives with model sizes from 2B to 31B, a permissive Apache 2.0 license, native multimodal support across all sizes, and the #3 spot on the global open model leaderboard.
The Most Capable Open Models Google Has Ever Released
Google DeepMind launched Gemma 4 on April 2, 2026, and the open model landscape shifted immediately. Four model sizes arrive simultaneously — Effective 2B (E2B), Effective 4B (E4B), 26B Mixture of Experts (MoE), and 31B Dense — covering the full spectrum from on-device smartphone inference to workstation-scale deployments. The 31B model currently ranks #3 on the global open model leaderboard; the 26B MoE model holds #6. For models anyone can download and run without restrictions, those rankings are remarkable.
Apache 2.0 Changes Everything
Previous Gemma releases carried custom use-restriction licenses that limited how developers could deploy the models commercially. Gemma 4 ships under Apache 2.0 — one of the most permissive open-source licenses available. This is a meaningful strategic shift. Apache 2.0 means developers can build commercial products on top of Gemma 4 without navigating Gemma-specific terms, and organizations can modify and redistribute the models freely.
The license change is a direct competitive move against Meta's Llama family, which carries its own custom license with commercial restrictions above certain user thresholds. Gemma 4's Apache 2.0 terms are simpler, cleaner, and more permissive for a wider range of use cases — a significant advantage for enterprise adoption.
Native Multimodal Across All Four Sizes
Every Gemma 4 model natively processes images and video, with variable resolution support built into the architecture. The two edge models — E2B and E4B — also include native audio input for speech recognition and understanding. Developers building voice-aware or vision-aware applications on-device no longer need a separate audio model; Gemma 4 handles it natively.
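For developers wiring image input into a chat request, the shape of the payload matters. Below is a minimal sketch of a multimodal user message in the OpenAI-style "content parts" format that many open-model runtimes accept; whether Gemma 4's chat template uses exactly this shape is an assumption, so check the model card for the canonical format.

```python
import base64

def build_vision_message(prompt: str, image_bytes: bytes) -> dict:
    """Pack a text prompt and an inline image into a single user message.

    Uses the generic "content parts" layout; the data-URL image encoding
    shown here is one common convention, not a Gemma-specific requirement.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

# Example usage with placeholder image bytes.
msg = build_vision_message("Describe this chart.", b"\x89PNG...")
```

The same message structure extends to video or audio parts on runtimes that support them; only the part `type` and payload encoding change.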
The 128K context window on the edge models and the 256K window on the 26B and 31B models provide substantial capacity for long-document processing, multi-turn conversation, and extended reasoning chains.
Agentic Capabilities Built In
Gemma 4 is engineered for the agentic era. Native function calling, structured JSON output, and system instruction support are built into the model architecture rather than bolted on. This gives developers the building blocks to construct autonomous agents that can call APIs, execute multi-step workflows, and interact with external tools reliably — without the prompt engineering overhead required to coax these behaviors from models not designed for them.
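To make the function-calling flow concrete, here is a minimal local sketch: a tool declared in the OpenAI-style schema that most open-model runtimes accept, and a dispatcher that parses a structured JSON tool call (as a model might emit it) and routes it to a local function. The `get_weather` tool, its schema, and the exact call format are illustrative assumptions, not Gemma 4's documented interface; nothing here contacts a model.

```python
import json

# Tool definition in the widely used OpenAI-style schema. Whether Gemma 4's
# chat template expects exactly this format is an assumption; consult the
# model card before relying on it.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def get_weather(city: str) -> dict:
    # Stand-in for a real weather API call.
    return {"city": city, "temp_c": 21, "conditions": "clear"}

def dispatch_tool_call(raw_json: str) -> dict:
    """Parse a model-emitted tool call and route it to a local handler."""
    call = json.loads(raw_json)
    handlers = {"get_weather": get_weather}
    return handlers[call["name"]](**call["arguments"])

# A structured tool call as a function-calling model might emit it.
model_output = '{"name": "get_weather", "arguments": {"city": "Zurich"}}'
result = dispatch_tool_call(model_output)
```

In a real agent loop, `result` would be serialized back into the conversation as a tool message so the model can produce its final answer.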
Efficiency is another standout achievement: the models run up to 4x faster than previous Gemma generations while consuming up to 60% less battery on mobile hardware. For developers targeting on-device deployment, those numbers meaningfully expand what is feasible at the edge.
Where to Get Gemma 4
Gemma 4 is available immediately on Hugging Face, Kaggle, and Ollama. The 31B and 26B models are accessible via Google AI Studio. The edge models are available through AI Edge Gallery for on-device deployment. The breadth of distribution channels reflects Google's commitment to making Gemma 4 genuinely accessible rather than technically open but practically difficult to access.
For developers evaluating open model options in 2026, Gemma 4 has earned a place at the very top of the shortlist.
Sources: Google Blog (April 2, 2026), Google Cloud Blog (April 2, 2026), The Next Web (April 2, 2026), SiliconAngle (April 2, 2026), The Register (April 2, 2026)
