Granite-3.1-2B-Base

ibm-granite

Introduction

The Granite-3.1-2B-Base model, developed by the Granite Team at IBM, extends the context length of its predecessor from 4K to 128K using a progressive training strategy. This model is intended for various text-to-text generation tasks, such as summarization and question-answering, and supports multiple languages including English, German, and Japanese.

Architecture

Granite-3.1-2B-Base is a decoder-only dense transformer model. Key components include grouped-query attention (GQA), rotary position embeddings (RoPE), an MLP with SwiGLU activation, RMSNorm, and shared input/output embeddings. The model has 40 layers, 32 attention heads, a sequence length of 128K, and about 2.5 billion parameters, all of which are active in this dense configuration.
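
As a quick sanity check, these hyperparameters can be read from the published configuration via transformers. The attribute names below are the standard config fields and are assumed (not confirmed by this card) to be exposed by the checkpoint:

    from transformers import AutoConfig

    # standard transformers config fields; assumed to be present on this checkpoint
    config = AutoConfig.from_pretrained("ibm-granite/granite-3.1-2b-base")
    print(config.num_hidden_layers)        # 40 transformer layers
    print(config.num_attention_heads)      # 32 attention heads
    print(config.num_key_value_heads)      # key/value heads used by GQA
    print(config.max_position_embeddings)  # 128K sequence length
    print(config.tie_word_embeddings)      # shared input/output embeddings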

Training

The model is trained using a three-stage strategy:

  • Stage 1: Diverse domain data, including web and academic sources.
  • Stage 2: Higher-quality data, including multilingual and instruction data, to enhance task performance.
  • Stage 3: Synthetic long-context data including QA/summary pairs.
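
As a purely illustrative sketch, and not IBM's actual training configuration, the staged curriculum above can be thought of as a simple schedule in which only the final stage pushes the sequence length toward 128K:

    # hypothetical illustration only; stage names, fields, and sequence lengths
    # are assumptions, not published Granite training settings
    stages = [
        {"stage": 1, "data": ["web", "academic"], "max_seq_len": 4_096},
        {"stage": 2, "data": ["multilingual", "instruction"], "max_seq_len": 4_096},
        {"stage": 3, "data": ["synthetic_qa", "synthetic_summaries"], "max_seq_len": 131_072},
    ]
    for s in stages:
        print(f"stage {s['stage']}: {s['data']} at {s['max_seq_len']} tokens")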

Training is conducted on IBM's Blue Vela supercomputing cluster with NVIDIA H100 GPUs, processing 12 trillion tokens.

Guide: Running Locally

  1. Install Required Libraries:

    pip install torch torchvision torchaudio
    pip install accelerate
    pip install transformers
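    # optional sanity check: confirm that PyTorch can see a GPU
    # (assumes an NVIDIA GPU and a CUDA build of torch; skip on CPU-only machines)
    python -c "import torch; print(torch.cuda.is_available())"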
    
  2. Run Example Code:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "ibm-granite/granite-3.1-2b-base"
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    # device_map="auto" places the weights on the available GPU(s), falling back to CPU
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
    model.eval()

    # change the input text as desired
    input_text = "Where is the Thomas J. Watson Research Center located?"
    # tokenize and move the inputs to the same device as the model
    input_tokens = tokenizer(input_text, return_tensors="pt").to(model.device)
    # generate output tokens; max_length caps prompt plus generated tokens
    output = model.generate(**input_tokens, max_length=4000)
    # decode the generated token ids back into text
    output = tokenizer.batch_decode(output)
    print(output)
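    # optional: sample instead of greedy decoding; the particular settings below
    # are illustrative assumptions, not recommendations from the model card
    output = model.generate(**input_tokens, max_new_tokens=200,
                            do_sample=True, temperature=0.7, top_p=0.9)
    print(tokenizer.batch_decode(output, skip_special_tokens=True))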
    
  3. Suggested Cloud GPUs: Consider using cloud services like AWS or Google Cloud for GPU resources.

License

Granite-3.1-2B-Base is released under the Apache 2.0 License.
