SILMA 9B Instruct v1.0
silma-ai
Introduction
SILMA-9B-Instruct-v1.0 is a leading Arabic language model developed by SILMA AI, designed to empower Arabic speakers with advanced AI capabilities. It is a 9-billion-parameter model that excels at a range of text-generation tasks, often outperforming larger models.
Architecture
The SILMA model is built on the robust foundational models of Google Gemma, combining their strengths to deliver high performance. It supports both Arabic and English languages and is optimized for conversational applications.
Training
The model has been evaluated on several Arabic benchmarks covering text-generation tasks, with notable scores. The benchmarks include MMLU, AlGhafa, ARC Challenge, and others, and performance is reported as normalized accuracy (acc_norm).
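For context on the metric: in the convention popularized by the lm-evaluation-harness, plain accuracy picks the answer choice with the highest log-likelihood, while acc_norm first divides each log-likelihood by the length of the choice text so that longer answers are not penalized for their length. The sketch below assumes SILMA's scores follow that convention (the source does not say which harness was used); the function names and the toy log-likelihoods are hypothetical.

```python
from typing import Sequence


def pick_raw(logliks: Sequence[float]) -> int:
    """Index of the choice with the highest raw log-likelihood (plain acc)."""
    return max(range(len(logliks)), key=lambda i: logliks[i])


def pick_normalized(logliks: Sequence[float], choices: Sequence[str]) -> int:
    """Index of the choice with the highest length-normalized log-likelihood.

    Assumption: normalization divides by the character length of each choice.
    """
    return max(range(len(logliks)), key=lambda i: logliks[i] / len(choices[i]))


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of questions where the predicted index matches the gold index."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Toy question with made-up scores: the longer (correct) answer has a lower
# total log-likelihood simply because it contains more text.
choices = ["Cairo", "The capital of Egypt is Cairo"]
logliks = [-6.0, -9.0]
gold_index = 1

raw_choice = pick_raw(logliks)                   # favors the short answer
norm_choice = pick_normalized(logliks, choices)  # favors the long answer

print("acc:", accuracy([raw_choice], [gold_index]))
print("acc_norm:", accuracy([norm_choice], [gold_index]))
```

In this toy case the two metrics disagree, which is why benchmark tables usually state which one they report.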
Guide: Running Locally
To run the SILMA model locally, follow these basic steps:
- Install Dependencies:
    pip install -U transformers sentencepiece
    pip install accelerate bitsandbytes
- Load and Run the Model Using PyTorch:
  - Import necessary libraries and load the model:
      import torch
      from transformers import AutoTokenizer, AutoModelForCausalLM

      model_id = "silma-ai/SILMA-9B-Instruct-v1.0"
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      # device_map="auto" places the weights on the available GPU(s),
      # so a separate model.to("cuda") call is not needed.
      model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)
- Inference:
  - Use the model to generate text based on user input:
      messages = [{"role": "user", "content": "Write a message..."}]
      # return_dict=True returns input_ids and attention_mask, which generate(**inputs) expects.
      # add_generation_prompt=True appends the marker that opens the model's turn.
      inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True).to(model.device)
      outputs = model.generate(**inputs, max_new_tokens=256)
      print(tokenizer.decode(outputs[0]))
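To make the `apply_chat_template` step less opaque, here is a minimal sketch of the prompt string that a Gemma-style chat template produces. It assumes SILMA keeps the turn markers of its Gemma base (`<bos>`, `<start_of_turn>`, `<end_of_turn>`, with the assistant role rendered as `model`); `render_gemma_prompt` is a hypothetical helper written for illustration, and the tokenizer's own template remains the source of truth.

```python
from typing import Dict, List


def render_gemma_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render chat messages into a Gemma-style prompt string (illustrative only)."""
    parts = ["<bos>"]
    for message in messages:
        if message["role"] == "system":
            # Assumption: like the Gemma template, no system role is supported.
            raise ValueError("System role not supported")
        # The assistant side of the conversation is labeled "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(
            f"<start_of_turn>{role}\n{message['content'].strip()}<end_of_turn>\n"
        )
    if add_generation_prompt:
        # Opens the model's turn so generation continues as the reply.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = render_gemma_prompt([{"role": "user", "content": "Hello"}])
print(prompt)
```

The real template also checks that user and model turns alternate, which this sketch omits for brevity.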
Suggested Cloud GPUs
- Recommended: Nvidia A40, L40, RTX A6000 (48 GB)
- Minimum: Nvidia RTX 4090, RTX 4000, L4 (16-24 GB)
These GPUs are suitable for running the model, especially in quantized modes (8-bit or 4-bit).
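The GPU tiers above follow from simple arithmetic on weight storage: number of parameters times bits per parameter. The sketch below is a rough estimate only. It assumes exactly 9 billion parameters and counts the weights alone, with no allowance for activations, the KV cache, or framework overhead; `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to hold the weights, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 9_000_000_000  # assumption: taken from the "9B" in the model name

for label, bits in [("bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions, bfloat16 weights need about 18 GB, which leaves little headroom on a 16-24 GB card, while 8-bit (about 9 GB) and 4-bit (about 4.5 GB) weights fit with room to spare. In the Transformers ecosystem, the quantized modes are typically provided by the `bitsandbytes` package installed in the guide above.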
License
The model is open-weight and free to use under the Gemma license, which encourages sharing and innovation while adhering to responsible usage guidelines.