Gemma 2 9B IT (google/gemma-2-9b-it)

Introduction
Gemma is a family of state-of-the-art, open, text-to-text, decoder-only large language models developed by Google. These models are designed for various text generation tasks, including question answering, summarization, and reasoning. They are relatively lightweight, enabling deployment on resource-constrained environments like laptops, desktops, or personal cloud infrastructure.
Architecture
Gemma models are based on the same research and technology used to create the Gemini models. They are available in English, with open weights for both pre-trained and instruction-tuned variants. The models are designed to democratize access to AI technology and foster innovation across diverse environments.
Training
Gemma models were trained using a diverse dataset, including web documents, code, and mathematics, to cover a broad range of topics and linguistic styles. The 27B model used 13 trillion tokens, while the 9B model used 8 trillion tokens. Training employed JAX and ML Pathways with Tensor Processing Unit (TPU) hardware to enhance speed and efficiency.
Guide: Running Locally
- Installation: Install the Transformers library:

  ```shell
  pip install -U transformers
  ```
- Pipeline API:

  ```python
  import torch
  from transformers import pipeline

  pipe = pipeline(
      "text-generation",
      model="google/gemma-2-9b-it",
      model_kwargs={"torch_dtype": torch.bfloat16},
      device="cuda",  # use "cpu" if no GPU is available
  )
  ```
- Single/Multi GPU Execution:

  ```python
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
  model = AutoModelForCausalLM.from_pretrained(
      "google/gemma-2-9b-it",
      device_map="auto",  # spreads layers across available GPUs
      torch_dtype=torch.bfloat16,
  )
  ```
- Alternative Precision: Use torch.float32, or quantize with bitsandbytes for 8-bit/4-bit precision.
- CLI Usage: Follow the installation instructions in the local-gemma repository.
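Since gemma-2-9b-it is an instruction-tuned chat model, prompts must follow Gemma's turn-based chat layout. A minimal sketch of that layout is below, assuming the `<start_of_turn>`/`<end_of_turn>` delimiters used by Gemma's chat template; in practice you should call `tokenizer.apply_chat_template`, which applies the exact template (including the leading `<bos>` token) for you:

```python
def format_gemma_chat(messages):
    """Render a list of {"role": ..., "content": ...} dicts into
    Gemma's chat prompt layout. Illustrative sketch only; prefer
    tokenizer.apply_chat_template for real use."""
    parts = []
    for m in messages:
        # Gemma names the reply role "model" rather than "assistant".
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    # Open a final model turn to cue the model to generate its reply.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = format_gemma_chat([{"role": "user", "content": "Hello!"}])
print(prompt)
```

The formatted string can then be passed to the pipeline or tokenizer; the model generates text until it emits the end-of-turn marker.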
Cloud GPUs: Consider using cloud services like AWS, Google Cloud, or Azure for access to powerful GPUs.
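The precision options above trade memory for fidelity, which also determines what hardware you need. A rough back-of-envelope estimate of the weight-only memory footprint at each precision, assuming roughly 9 billion parameters and ignoring activations and the KV cache:

```python
def weight_memory_gib(n_params, bits_per_param):
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bits_per_param / 8 / 2**30

N_PARAMS = 9e9  # ~9B parameters (approximate)
for name, bits in [("float32", 32), ("bfloat16", 16),
                   ("int8", 8), ("4-bit", 4)]:
    print(f"{name:>8}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

This is why bfloat16 or a quantized variant is the practical choice on a single consumer GPU, while float32 generally requires a large-memory cloud GPU or CPU offloading.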
License
The Gemma models are subject to Google's usage license. Users must review and agree to the license terms before accessing the models on Hugging Face.