Kamil Józwik

Gemma

The Gemma models represent Google's ambitious effort to provide powerful yet lightweight open-source alternatives to their proprietary Gemini models. Designed with versatility and efficiency in mind, the Gemma family aims to empower developers and researchers by providing state-of-the-art AI capabilities that can run on modest hardware setups like mobile phones, laptops and single-GPU servers.

Unlike the massive, resource-intensive models typically associated with advanced AI, Gemma models focus on delivering substantial computational efficiency without sacrificing performance. They are optimized for easy integration, portability, and flexibility, making them particularly suitable for applications where limited resources or data privacy concerns rule out larger, cloud-based models.

Gemma 1

The Gemma 1 models emerged in early 2024. The initial release included relatively compact model sizes — 2B and 7B parameters. These early models were trained primarily on English-language web documents, computer code, and mathematics-related data, providing users with a solid baseline for natural language understanding, programming assistance, and basic quantitative reasoning.

Gemma 1 rapidly gained traction due to its openness. Google freely provided model weights under permissive licenses, encouraging experimentation and fostering broad community involvement from the outset.

Gemma 2

Building on Gemma 1's foundation, Gemma 2 arrived by mid-2024, marking significant advancements both architecturally and in performance. Gemma 2 introduced larger-scale models with parameter counts expanded to 9B and 27B, alongside some architectural enhancements.

Gemma 2 also benefited from an expanded training dataset, including a richer mix of scientific articles and coding repositories. Combined with advanced techniques like knowledge distillation from larger "teacher" models, Gemma 2 delivered impressive performance gains that accelerated community adoption, resulting in over 100 million downloads within its first year. Thousands of community-contributed variants and fine-tunes emerged, showcasing the model family's flexibility and ease of customization.

Gemma 3

Building upon the substantial successes and rapid community uptake of Gemma 1 and Gemma 2, Google unveiled Gemma 3 in March 2025 as its most advanced open model yet. Gemma 3 not only incorporates the latest research developments from Google's Gemini 2.0 program but also adds new capabilities: multimodal input, extended context windows, and broad multilingual support.

You can find even more information, guides, examples and model cards in this official repository.

For more advanced users, the Gemma 3 technical report offers a deeper dive into these enhancements.

This image comes directly from Google's materials. One more thing worth mentioning is hardware requirements: the dots show estimated NVIDIA H100 GPU counts. Gemma 3 27B ranks highly while requiring only a single GPU, whereas some competing models need up to 32.

Timeline

The Gemma model family is still quite small (for now), but it is worth summarizing the timeline of its releases and major updates.

| Timeframe | Model Versions | Key Technical Breakthroughs |
| --- | --- | --- |
| Early 2024 | Gemma 1 (2B, 7B) | Initial open models, English-centric |
| Mid-2024 | Gemma 2 (9B, 27B) | Architectural improvements, 8K token context |
| Mar 2025 | Gemma 3 (1B, 4B, 12B, 27B) | Multimodal input, 128K context window, multilingual (140+ languages), quantization and function calling |
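The quantization entry is worth a quick back-of-the-envelope check, because it explains how a 27B model can fit on a single GPU: weight memory is simply parameter count times bytes per parameter. A minimal sketch (weights only; activations and the KV cache add further overhead, and the exact numbers here are illustrative arithmetic, not measured figures):

```python
def model_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Rough weight-only memory footprint in GB (1 GB = 1e9 bytes).
    Activations and the KV cache are not included."""
    return params_billions * 1e9 * bytes_per_param / 1e9

# Gemma 3 27B at different precisions:
print(model_memory_gb(27, 2.0))  # bf16 (2 bytes/param): 54.0 GB
print(model_memory_gb(27, 0.5))  # int4 (0.5 bytes/param): 13.5 GB
```

At bf16 the 27B weights alone approach the capacity of an 80 GB H100, while a 4-bit quantized copy leaves ample headroom, which is why quantization matters so much for single-GPU deployment.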

Gemma 3: closer look

At the time of writing (March 2025), Gemma 3 is the latest and most powerful model in the Gemma family, so let's take a closer look at its capabilities.

| | 1B | 4B | 12B | 27B |
| --- | --- | --- | --- | --- |
| Optimized for | Ultra-low resource environments (mobile/edge devices) | Balanced performance (laptops, workstations) | Balanced performance with higher capabilities | State-of-the-art performance |
| Hardware requirements | Mobile phones, IoT devices | Single GPU/TPU, laptops | Single GPU/TPU, workstations | Single modern GPU/TPU |
| Context window | 32K tokens | 128K tokens | 128K tokens | 128K tokens |
| Modality | Text-only | Text + visual (images, videos) | Text + visual (images, videos) | Text + visual (images, videos) |
| Visual capabilities | None | Visual Q&A, image captioning | Visual Q&A, image captioning | Visual Q&A, image captioning |
| Speed | Fastest (optimized for rapid inference) | Fast | Moderate | Slowest of the family |
| Multilingual support | English | 140+ languages | 140+ languages | 140+ languages |
| Function calling | Yes | Yes | Yes | Yes |
| Structured output | Yes | Yes | Yes | Yes |
| Preferred env. | Edge devices, mobile apps | Edge to mid-range servers | Workstations, servers | Dedicated GPU/TPU systems |

Model sizes & efficiency

Gemma 3 is available in four distinct sizes: 1B, 4B, 12B, and 27B parameters. Each model is optimized to deliver robust performance even on limited hardware resources.

Context window

One of Gemma 3’s defining advancements is its expanded context window: the 1B model handles up to 32K tokens, while the 4B, 12B, and 27B models handle up to 128K tokens, enough to process hundreds of pages of text in a single prompt.
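The table above lists 32K tokens for the 1B model and 128K tokens for the larger sizes. As a rough illustration of what those budgets mean in practice, here is a minimal sketch for checking whether a document is likely to fit. The 4-characters-per-token ratio is a common English-text approximation, not an exact count; use the model's real tokenizer for precise numbers.

```python
# Rough heuristic: ~4 characters of English text per token.
CHARS_PER_TOKEN = 4

# Approximate context window sizes for the Gemma 3 family (in tokens).
GEMMA3_CONTEXT = {"1b": 32_000, "4b": 128_000, "12b": 128_000, "27b": 128_000}

def estimate_tokens(text: str) -> int:
    """Estimate the token count of a text with the chars-per-token heuristic."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, model_size: str, reserve_for_output: int = 1_000) -> bool:
    """Check whether the text likely fits, leaving room for the model's reply."""
    budget = GEMMA3_CONTEXT[model_size] - reserve_for_output
    return estimate_tokens(text) <= budget

doc = "word " * 40_000  # 200,000 characters, i.e. roughly 50,000 tokens
print(fits_in_context(doc, "1b"))   # False: exceeds the 32K window
print(fits_in_context(doc, "27b"))  # True: well within 128K
```

The same document that overflows the 1B model's window fits comfortably in the larger models, which is exactly the trade-off the table captures.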

Multimodality & multilingual support

Gemma 3 introduces multimodal capabilities: the 4B, 12B, and 27B models accept both text and visual input (images and videos) and handle tasks such as visual Q&A and image captioning, while the 1B model remains text-only. Multilingual support expands accordingly, from English-only in the 1B model to 140+ languages in the larger sizes.
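To show what a multimodal request can look like, here is a sketch that assembles a chat-style message mixing text and a base64-encoded image for a vision-capable size (4B and up). The message schema here is illustrative only; the exact field names depend on the runtime or API you use, so check its documentation.

```python
import base64

def image_part(image_bytes: bytes) -> dict:
    """Wrap raw image bytes as a base64 content part.
    The {"type": "image", "data": ...} shape is an assumption for illustration."""
    return {"type": "image", "data": base64.b64encode(image_bytes).decode("ascii")}

def build_multimodal_prompt(question: str, image_bytes: bytes) -> list[dict]:
    """Assemble a single user turn combining a text question and an image."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            image_part(image_bytes),
        ],
    }]

# Placeholder bytes stand in for a real image file read from disk.
messages = build_multimodal_prompt("What is in this picture?", b"\x89PNG...fake...")
```

In a real application you would read the image with `open(path, "rb").read()` and pass `messages` to whichever client library serves your Gemma 3 model.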

Function calling & structured output

Gemma 3 also enhances developer integration capabilities: every model size supports function calling, letting the model emit structured requests that your application routes to external tools, and structured output, which constrains responses to a predictable format such as JSON.
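To make function calling and structured output concrete, here is a minimal, runtime-agnostic sketch of the loop: a tool the model may call, a simulated structured response, and the parse-and-dispatch step. The JSON shape is illustrative; each runtime defines its own format, and `model_response` stands in for what the model would actually return.

```python
import json

def get_weather(city: str) -> dict:
    """Stand-in for a real weather lookup the model is allowed to trigger."""
    return {"city": city, "temp_c": 21}

# Registry of callable tools, advertised to the model in the prompt.
TOOLS = {"get_weather": get_weather}

# Simulated model output: structured output means the model is constrained
# to emit valid JSON like this instead of free-form prose.
model_response = '{"tool": "get_weather", "arguments": {"city": "Paris"}}'

def dispatch(response_text: str) -> dict:
    """Parse the model's JSON function call and invoke the matching tool."""
    call = json.loads(response_text)
    fn = TOOLS[call["tool"]]
    return fn(**call["arguments"])

result = dispatch(model_response)
print(result)  # {'city': 'Paris', 'temp_c': 21}
```

In a full agent loop, `result` would be fed back to the model as a tool message so it can compose a final natural-language answer.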

Get started

As Gemma is open source, you can easily get started with it. Below you can find a few resources where Gemma models are now available.

This is just a starting point - you will find these models at many other LLM providers as well.
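As one concrete illustration of a local setup, many local runtimes expose a simple HTTP API. The sketch below only builds the JSON request body; the field names follow the commonly documented Ollama `/api/generate` shape, but treat the endpoint, model tag, and fields as assumptions and check your runtime's docs.

```python
import json

def build_generate_request(prompt: str, model: str = "gemma3:4b") -> str:
    """Build a JSON request body for an Ollama-style /api/generate call.
    The "gemma3:4b" tag and field names are assumptions for illustration."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload)

body = build_generate_request("Explain the Gemma model family in one sentence.")
# POST this body to e.g. http://localhost:11434/api/generate with any HTTP client.
```

Swapping the `model` argument (for example to a 1B tag for edge hardware) is all it takes to target a different size of the family.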

To start building your own applications with Gemma 3 right now, watch Google's "3 ways to run Gemma's latest version" video.

Gemmaverse

The Gemmaverse represents an ecosystem surrounding Google's Gemma models, driven by community engagement and innovation. Through contributions, specialized variants, and forward-looking community-led initiatives, the Gemmaverse illustrates the power of open-source collaboration in the field of artificial intelligence.

Specialized variants

Within the Gemmaverse, a range of specialized Gemma model variants has emerged, each tailored to address specific use cases and industries. You can find the majority of them on Hugging Face.

Here are also some examples of specialized Gemma variants built by Google:

Real-world examples

Gemma models are already being used in various real-world applications, and the list is growing rapidly. Here are some examples of their professional usage:

Stay tuned

Google has created a dedicated YouTube playlist for Gemma-related materials and recently (April 2025) hosted a Gemma Developer Day event in Paris, where it shared insights about the Gemma models and their applications. The whole event is also available on YouTube as a playlist.

What next?

Gemma models are still relatively new, and the community around them is evolving rapidly. One thing is certain: the Gemma family is here to stay and will be a strong alternative to Gemini (and other LLM) models. The open-source nature of these models encourages experimentation, innovation, and collaboration, making them a valuable asset for developers and researchers alike.

Google, keep up the good work!