gemma 2 2b it

google

Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, designed for text-to-text, decoder-only tasks. Available in English, these models are suitable for diverse text generation tasks like question answering, summarization, and reasoning. Their compact size facilitates deployment in resource-constrained environments, such as laptops or cloud infrastructure.

Architecture

Gemma models are built using the same research and technology that underpins the Gemini models. They are designed to function as decoder-only large language models with open weights for both pre-trained and instruction-tuned variants. The models are trained on a diverse dataset, including web documents, code, and mathematical texts, leveraging Tensor Processing Unit (TPU) hardware for efficient training.

Training

Gemma models were trained on datasets comprising trillions of tokens from a range of sources, predominantly in English. The training utilized Google's TPU hardware and was conducted with the JAX and ML Pathways frameworks, ensuring efficient processing and handling of large-scale data. The models were rigorously evaluated for ethics and safety, ensuring compliance with Google's content policies.

Guide: Running Locally

Basic Steps

  1. Install Dependencies:

    pip install -U transformers accelerate bitsandbytes
    
  2. Load Model and Tokenizer:

    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
    model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", torch_dtype=torch.bfloat16)
    
  3. Inference with GPU:

    import torch
    model.to("cuda")
    input_text = "Write me a poem about Machine Learning."
    input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
    outputs = model.generate(**input_ids, max_new_tokens=32)
    print(tokenizer.decode(outputs[0]))
    

Cloud GPUs

For enhanced performance, consider using cloud GPUs such as those provided by Google Cloud's TPU or AWS EC2 P3 instances.

License

The Gemma models are available under the Gemma license. To access and use these models on Hugging Face, users must review and agree to Google’s usage license.

More Related APIs in Text Generation