Google drops Gemma 4 with Apache 2.0 license, finally listening to developers

Google’s Gemini models have gotten impressively better over the last year, but they still come with Google’s strings attached. If you want real freedom to tinker, you’ve been stuck with the Gemma line—and Gemma 3 launched over a year ago, which in AI land is practically ancient history.

Starting today, that changes. Gemma 4 is here, and it comes in four sizes optimized for running locally. More importantly, Google finally dropped its custom Gemma license in favor of Apache 2.0, which is what developers have been asking for all along. The old license had restrictions that made some commercial use cases awkward, and Google is acknowledging that was a mistake.

As with previous Gemma releases, these models are designed to run on local hardware. But “local” can mean very different things depending on your budget. The two larger variants—a 26B Mixture of Experts and a 31B Dense model—can run unquantized in bfloat16 on a single Nvidia H100 GPU. Sure, that’s a $20,000 accelerator, but it’s still a single card in your own machine. If you quantize them down to lower precision, they’ll fit on consumer GPUs, which is where things get interesting.
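The single-H100 claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below estimates weight storage only, using the parameter counts quoted above and the standard sizes of 2 bytes per bfloat16 value, 1 byte for 8-bit, and half a byte for 4-bit. The 8-bit and 4-bit rows are illustrative assumptions about how someone might quantize the models, not formats Google has announced, and the estimate ignores the KV cache, activations, and runtime overhead, all of which need memory too.

```python
# Weights-only memory estimate. Ignores KV cache, activations, and
# framework overhead, so real requirements will be higher.
BYTES_PER_PARAM = {
    "bfloat16": 2.0,  # unquantized format mentioned in the article
    "int8": 1.0,      # assumed quantization level, for illustration
    "int4": 0.5,      # assumed quantization level, for illustration
}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes)."""
    return params_billions * BYTES_PER_PARAM[dtype]


# Parameter counts as quoted in the article.
for name, size in [("26B MoE", 26.0), ("31B Dense", 31.0)]:
    for dtype in BYTES_PER_PARAM:
        print(f"{name:10s} {dtype:9s} ~{weight_memory_gb(size, dtype):.1f} GB")
```

By this estimate the 31B Dense model needs about 62 GB for weights in bfloat16, which fits on an 80 GB H100 with some room left for the cache, and about 15.5 GB at 4-bit, which is in range of a 24 GB consumer card. Note that the MoE model still has to hold all 26 billion parameters in memory, even though only a fraction of them do work on any given token.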

Google also claims to have focused on reducing latency. The 26B MoE model activates only 3.8 billion of its 26 billion parameters for each token during inference, which means it spits out tokens much faster than similarly sized dense models. The 31B Dense variant prioritizes quality over speed, but Google expects developers to fine-tune it for specific use cases rather than rely on the base model for everything.
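To put the MoE numbers in perspective, here is a rough sketch of the compute saved per token. It leans on a common rule of thumb, roughly 2 floating-point operations per active parameter per generated token, which is an approximation I'm assuming here and not a figure from Google. Real-world speed also depends on memory bandwidth, expert routing overhead, and batch size, so treat the ratio as a ceiling on the benefit, not a benchmark.

```python
# Parameter counts as quoted in the article.
MOE_TOTAL_PARAMS = 26e9
MOE_ACTIVE_PARAMS = 3.8e9
DENSE_PARAMS = 31e9


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active / total


def flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter."""
    return 2.0 * active_params


fraction = active_fraction(MOE_ACTIVE_PARAMS, MOE_TOTAL_PARAMS)
compute_ratio = flops_per_token(DENSE_PARAMS) / flops_per_token(MOE_ACTIVE_PARAMS)

print(f"MoE active fraction: {fraction:.1%}")
print(f"31B Dense vs 26B MoE compute per token: {compute_ratio:.1f}x")
```

Under these assumptions the MoE model touches about 15 percent of its weights per token, and the 31B Dense model does roughly eight times as much arithmetic for each token it generates.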

This is a solid move from Google, especially the license change. Apache 2.0 is permissive enough for almost any commercial or research project, and it removes the friction that made some teams hesitate to build on top of Gemma. The real question is whether the smaller variants will actually run well on consumer hardware, or if Google’s definition of “local” still assumes you have a datacenter in your basement. I’ll be testing that as soon as I can get my hands on the weights.
