Gemma 2 9B IT (google/gemma-2-9b-it)

Introduction
Gemma is a family of state-of-the-art, open, text-to-text, decoder-only large language models developed by Google. These models are designed for various text generation tasks, including question answering, summarization, and reasoning. They are relatively lightweight, enabling deployment on resource-constrained environments like laptops, desktops, or personal cloud infrastructure.
Architecture
Gemma models are based on the same research and technology used to create the Gemini models. They are available in English, with open weights for both pre-trained and instruction-tuned variants. The models are designed to democratize access to AI technology and foster innovation across diverse environments.
Training
Gemma models were trained using a diverse dataset, including web documents, code, and mathematics, to cover a broad range of topics and linguistic styles. The 27B model used 13 trillion tokens, while the 9B model used 8 trillion tokens. Training employed JAX and ML Pathways with Tensor Processing Unit (TPU) hardware to enhance speed and efficiency.
Guide: Running Locally
- Installation: Install the Transformers library:

  ```shell
  pip install -U transformers
  ```
- Pipeline API:

  ```python
  import torch
  from transformers import pipeline

  pipe = pipeline(
      "text-generation",
      model="google/gemma-2-9b-it",
      model_kwargs={"torch_dtype": torch.bfloat16},
      device="cuda",  # use "cpu" if no GPU is available
  )
  ```
- Single/Multi GPU Execution:

  ```python
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
  model = AutoModelForCausalLM.from_pretrained(
      "google/gemma-2-9b-it",
      device_map="auto",  # spreads layers across available GPUs
      torch_dtype=torch.bfloat16,
  )
  ```
- Alternative Precision: Use torch.float32, or quantize with bitsandbytes for 8-bit/4-bit precision.
- CLI Usage: Follow the installation instructions in the local-gemma repository.
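Since gemma-2-9b-it is an instruction-tuned chat model, prompts must follow Gemma's turn-based chat layout. A minimal sketch of that layout is below, assuming the `<start_of_turn>`/`<end_of_turn>` delimiters used by Gemma's chat template; in practice you should call `tokenizer.apply_chat_template`, which applies the exact template (including the leading `<bos>` token) for you:

```python
def format_gemma_chat(messages):
    """Render a list of {"role": ..., "content": ...} dicts into
    Gemma's chat prompt layout. Illustrative sketch only; prefer
    tokenizer.apply_chat_template for real use."""
    parts = []
    for m in messages:
        # Gemma names the reply role "model" rather than "assistant".
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    # Open a final model turn to cue the model to generate its reply.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = format_gemma_chat([{"role": "user", "content": "Hello!"}])
print(prompt)
```

The formatted string can then be passed to the pipeline or tokenizer; the model generates text until it emits the end-of-turn marker.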
Cloud GPUs: Consider using cloud services like AWS, Google Cloud, or Azure for access to powerful GPUs.
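The precision options above trade memory for fidelity, which also determines what hardware you need. A rough back-of-envelope estimate of the weight-only memory footprint at each precision, assuming roughly 9 billion parameters and ignoring activations and the KV cache:

```python
def weight_memory_gib(n_params, bits_per_param):
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bits_per_param / 8 / 2**30

N_PARAMS = 9e9  # ~9B parameters (approximate)
for name, bits in [("float32", 32), ("bfloat16", 16),
                   ("int8", 8), ("4-bit", 4)]:
    print(f"{name:>8}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

This is why bfloat16 or a quantized variant is the practical choice on a single consumer GPU, while float32 generally requires a large-memory cloud GPU or CPU offloading.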
License
The Gemma models are subject to Google's usage license. Users must review and agree to the license terms before accessing the models on Hugging Face.