Introduction

Sarvam-1 is a 2-billion-parameter language model optimized for Indian languages. It performs strongly across 10 Indic languages and is competitive with larger models. It is designed for text completion and for fine-tuning on downstream tasks.

Architecture

  • Hidden Size: 2048
  • Intermediate Size: 11,008
  • Attention Heads: 16
  • Hidden Layers: 28
  • Key-Value Heads: 8
  • Max Position Embeddings: 8,192
  • Activation Function: SwiGLU
  • Positional Embeddings: Rotary (RoPE) with theta=10,000
  • Attention: Grouped-query attention (GQA)
  • Precision: Trained with bfloat16 mixed precision
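
For reference, here is a minimal sketch of how the hyperparameters listed above would map onto a Hugging Face transformers config. It assumes Sarvam-1 follows the standard Llama-style decoder layout (an assumption for illustration; the exact config class and remaining fields, such as vocabulary size, come from the published checkpoint):

    from transformers import LlamaConfig

    # Illustrative only: maps the listed hyperparameters onto a Llama-style config.
    config = LlamaConfig(
        hidden_size=2048,
        intermediate_size=11008,
        num_attention_heads=16,
        num_hidden_layers=28,
        num_key_value_heads=8,          # grouped-query attention
        max_position_embeddings=8192,
        hidden_act="silu",              # SwiGLU uses the SiLU gate in the MLP
        rope_theta=10000.0,             # rotary positional embeddings (RoPE)
    )
    print(config)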

Training

  • Infrastructure: Yotta's Shakti cluster
  • Hardware: 1,024 GPUs
  • Duration: 5 days
  • Framework: NVIDIA NeMo

Guide: Running Locally

  1. Install Transformers Library:

    pip install transformers
    
  2. Load Model and Tokenizer:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Download the weights and tokenizer from the Hugging Face Hub.
    model = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
    tokenizer = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")
    
  3. Generate Text:

    # Hindi prompt: "The capital of Karnataka is:"
    text = "कर्नाटक की राजधानी है:"
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=5)
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(result)
    
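As an alternative to the step-by-step calls above, the transformers text-generation pipeline wraps tokenization, generation, and decoding in a single call. This is a minimal sketch, assuming PyTorch is installed alongside transformers:

    from transformers import pipeline

    # The pipeline handles tokenization, generation, and decoding in one call.
    generator = pipeline("text-generation", model="sarvamai/sarvam-1")
    # Hindi prompt: "The capital of Karnataka is:"
    print(generator("कर्नाटक की राजधानी है:", max_new_tokens=5))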

Cloud GPU Suggestion

For faster inference, consider using a GPU instance from a cloud provider such as AWS, Azure, or GCP.
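
If a GPU is available, the model can be loaded in bfloat16 (the precision it was trained in) and moved to the device explicitly. This is a hedged sketch, not an official recipe; the device check and dtype choice are assumptions about your environment:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use a GPU if one is available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # bfloat16 halves memory use relative to float32; assumes the GPU supports it.
    model = AutoModelForCausalLM.from_pretrained(
        "sarvamai/sarvam-1",
        torch_dtype=torch.bfloat16 if device == "cuda" else torch.float32,
    ).to(device)
    tokenizer = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")

    inputs = tokenizer("कर्नाटक की राजधानी है:", return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=5)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))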

License

Sarvam-1 is released under a non-commercial license. For more details, refer to the LICENSE file.
