Nemotron-4-Mini-Hindi-4B-Instruct

nvidia

Introduction

Nemotron-4-Mini-Hindi-4B-Instruct is a small language model developed by NVIDIA, designed to generate responses to queries grounded in the Indian context, supporting Hindi, English, and Hinglish. It is an aligned version of the Nemotron-4-Mini-Hindi-4B-Base model. The model is available for commercial use and supports a context length of 4,096 tokens. For more detailed information, refer to the arXiv paper.
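Because the context length is 4,096 tokens, the prompt and the generated continuation have to fit in that window together. Below is a minimal sketch of the arithmetic, assuming the limit covers prompt tokens plus newly generated tokens combined. The helper name `max_new_tokens_allowed` is hypothetical and not part of any library.

```python
CONTEXT_LENGTH = 4096  # context length stated for this model


def max_new_tokens_allowed(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never return a negative budget when the prompt already exceeds the window.
    return max(context_length - prompt_tokens, 0)


print(max_new_tokens_allowed(1000))  # 3096
print(max_new_tokens_allowed(5000))  # 0
```

In practice the prompt token count would come from the tokenizer output, and the result could be used to cap `max_new_tokens` before calling `generate`.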

Architecture

The model uses a transformer decoder architecture with an embedding size of 3072, 32 attention heads, and an MLP intermediate dimension of 9216. It implements Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE).
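To make the two attention-related terms concrete, here is an illustrative sketch in plain Python. It is not NVIDIA's implementation. The function names are hypothetical; the rotary base of 10000 and the pairing of adjacent elements follow the original RoPE formulation and are not confirmed for this model; and the figure of 8 key/value heads is an assumption chosen for illustration, since only the 32 attention heads are stated above.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (RoPE)."""
    dim = len(vec)
    if dim % 2:
        raise ValueError("RoPE needs an even vector length")
    out = []
    for i in range(0, dim, 2):
        # Pair (i, i+1) rotates with frequency base^(-i/dim), scaled by position.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product, standing in for one attention score."""
    return sum(x * y for x, y in zip(a, b))


def kv_head_for_query_head(query_head, num_query_heads=32, num_kv_heads=8):
    """GQA: map a query head to the key/value head its group shares.

    num_kv_heads=8 is an illustrative assumption, not a documented value.
    """
    if num_query_heads % num_kv_heads:
        raise ValueError("query heads must divide evenly into KV groups")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


q = [1.0, 0.0, 0.5, -0.5]
k = [0.2, 0.7, -0.3, 0.1]

# The score depends only on the relative offset, which is 2 in both cases.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 3))
score_b = dot(rope_rotate(q, 12), rope_rotate(k, 10))
print(abs(score_a - score_b) < 1e-9)

# With these assumed sizes, query heads 0-3 share KV head 0 and 4-7 share KV head 1.
print([kv_head_for_query_head(h) for h in range(8)])
```

The first check illustrates why RoPE encodes relative position: rotating queries and keys by their absolute positions leaves a score that depends only on the distance between them. The second shows how GQA lets several query heads reuse one key/value head, which shrinks the key/value cache.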

Training

Nemotron-4-Mini-Hindi-4B-Instruct is fine-tuned from the Nemotron-4-Mini-Hindi-4B-Base model on a mix of real and synthetic alignment data. The model underwent extensive safety evaluation. Its training data may nonetheless contain biases, which can affect its outputs.

Guide: Running Locally

  1. Environment Setup: Ensure you have Python, PyTorch, and the Transformers library installed. PyTorch is needed because the code below requests PyTorch tensors (return_tensors="pt").
  2. Load the Model:
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-4-Mini-Hindi-4B-Instruct")
    model = AutoModelForCausalLM.from_pretrained("nvidia/Nemotron-4-Mini-Hindi-4B-Instruct")
    
  3. Generate Text:
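    # Single user turn; the Hindi prompt means "Tell me about the culture of India."
    messages = [{"role": "user", "content": "भारत की संस्कृति के बारे में बताएं।"}]
    tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
    # Generate up to 128 new tokens; the decoded text includes the prompt
    outputs = model.generate(tokenized_chat, max_new_tokens=128)
    print(tokenizer.decode(outputs[0]))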
    
  4. Using a Pipeline:
    from transformers import pipeline
    
    # Reuses the tokenizer loaded in step 2 and the messages defined in step 3
    pipe = pipeline("text-generation", model="nvidia/Nemotron-4-Mini-Hindi-4B-Instruct", tokenizer=tokenizer, max_new_tokens=128)
    print(pipe(messages))
    
  5. Suggested Cloud GPUs: Consider an NVIDIA A100 GPU for optimal performance.
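
When choosing a GPU, a rough estimate of the memory needed just to hold the weights can help. The sketch below assumes roughly 4 billion parameters (inferred from the "4B" in the model name; the exact count is not stated here) and ignores activations, the key/value cache, and framework overhead, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


APPROX_PARAMS = 4e9  # assumption: "4B" taken as about 4 billion parameters

for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: ~{weight_memory_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights alone need about 15 GiB in float32 and about 7.5 GiB in bfloat16, which gives a lower bound when sizing a GPU.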

License

Nemotron-4-Mini-Hindi-4B-Instruct is released under the NVIDIA Open Model License Agreement.
