Gemma 2, the latest iteration in the Gemma family of AI models, is now available for researchers and developers globally. Building on the success of its predecessors, Gemma 2 offers state-of-the-art performance and efficiency, accessible in both 9 billion (9B) and 27 billion (27B) parameter sizes.
This new version outperforms models of similar size and even competes with significantly larger ones. Because it runs efficiently on a single-GPU setup such as an NVIDIA H100 Tensor Core GPU or a TPU host, that performance comes with markedly lower deployment costs.
Gemma 2’s architecture has been redesigned to maximise performance and inference efficiency, making it a standout in its class. The 27B model in particular is claimed to deliver exceptional results, surpassing models more than twice its size, while the 9B model outperforms competitors in the same size category, such as Llama 3 8B. These efficiency gains translate into significant cost savings, making high-performance AI more accessible and budget-friendly.
images courtesy of Google
The new model supports a wide range of hardware configurations, from high-end desktops and powerful gaming laptops to cloud-based systems. Users can run Gemma 2 at full precision in Google AI Studio or use its quantised version locally with tools like Gemma.cpp. Integration with widely used frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow lets Gemma 2 slot seamlessly into existing workflows.
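As a hedged sketch of what local experimentation through Hugging Face Transformers might look like: the Hub id `google/gemma-2-9b` and the chat-turn markers below are assumptions carried over from the original Gemma release, not details confirmed by this article, so check the official model card before use.

```python
# Hypothetical sketch of local Gemma 2 inference via Hugging Face Transformers.
# MODEL_ID and the chat format are assumptions; verify against the model card.
MODEL_ID = "google/gemma-2-9b"

def format_turn(user_message: str) -> str:
    """Wrap a user message in Gemma's assumed single-turn chat format."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and complete the prompt.

    Requires `pip install transformers accelerate torch` and enough
    GPU/CPU memory for the 9B weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer  # deferred: heavy deps

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]  # strip the prompt tokens
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Usage (downloads the full weights on first run):
# print(generate(format_turn("Explain Gemma 2 in one sentence.")))
```

The deferred import keeps the helpers usable even before the heavy dependencies are installed; the same pattern works with the 27B checkpoint by swapping the model id.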
Gemma 2 is released under a commercially friendly license, encouraging innovation and commercialisation. Its compatibility with various AI frameworks allows developers to use their preferred tools without hassle. The model is optimised for NVIDIA-accelerated infrastructure and will soon support NVIDIA NeMo for further optimisation. Developers can fine-tune Gemma 2 using Keras and Hugging Face, with additional parameter-efficient fine-tuning options in the pipeline.
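Parameter-efficient methods such as LoRA keep the base weights frozen and train small low-rank adapters instead: a rank-r adapter on a linear layer with dimensions (d_in, d_out) adds only r × (d_in + d_out) trainable parameters. A minimal sketch of that arithmetic, with hypothetical layer shapes rather than Gemma 2's actual configuration:

```python
def lora_trainable_params(layer_shapes, r=8):
    """Extra trainable parameters LoRA adds: each adapted linear layer of
    shape (d_in, d_out) gains a rank-r matrix pair, r * (d_in + d_out) params."""
    return sum(r * (d_in + d_out) for d_in, d_out in layer_shapes)

# Hypothetical example: adapting the query and value projections of one
# 4096-wide attention block at rank 8.
shapes = [(4096, 4096), (4096, 4096)]
print(lora_trainable_params(shapes, r=8))  # 131072 trainable params
```

Against the roughly 33.5 million frozen parameters in those two 4096×4096 layers, the adapter trains well under 1% of the weights, which is why such options make fine-tuning large models affordable.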
To aid developers in utilising Gemma 2, a new Gemma Cookbook offers practical examples and recipes for building applications and fine-tuning models for specific tasks. The Responsible Generative AI Toolkit, including the open-sourced LLM Comparator, supports responsible AI development by helping evaluate language models comprehensively. Additionally, text watermarking technology, SynthID, will be open-sourced to further aid in responsible AI deployment.
Why is this important?
Following the initial success of the Gemma models, which saw over 10 million downloads, Gemma 2 is set to accelerate development in even more ambitious AI projects. Future developments include a 2.6B parameter model to bridge the gap between lightweight accessibility and powerful performance.
Gemma 2 will be available in Google AI Studio, Kaggle, and Hugging Face Models, with further access options through Vertex AI Model Garden and Google Cloud credits for academic researchers.