Ahma 7 B Instruct
Finnish-NLPIntroduction
Ahma-7B-Instruct is a conversational model fine-tuned for instruction-following in Finnish. It is based on the Ahma-7B model, a decoder-only transformer using the Llama architecture, pre-trained from scratch on Finnish data.
Architecture
The model uses a transformer architecture with a context length of 2048, 32 layers, a dimension of 4096, and 32 heads. The base Ahma-7B model has 7 billion parameters. The instruct-tuned version is optimized for chat-like interactions.
Training
The model was initially fine-tuned using supervised fine-tuning (SFT) with a diverse dataset that included translated and synthetic data. It was further refined through Direct Preference Optimization (DPO) using datasets focused on instruction-following. Training involved techniques like Byte Pair Encoding for tokenization and Rank-Stabilized LoRA for parameter adaptation.
Guide: Running Locally
- Dependencies: Install the
transformers
andtorch
libraries. - Model and Tokenizer: Use
AutoTokenizer
andAutoModelForCausalLM
from thetransformers
library. - Prompt Setup: Format prompts using the built-in chat template feature.
- Execution: Utilize a CUDA-enabled GPU for optimal performance. Consider using cloud GPU services like AWS, Google Cloud, or Azure for more efficient processing.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/Ahma-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Finnish-NLP/Ahma-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")
messages = [{"role": "system", "content": "Olet tekoälyavustaja..."}]
inputs = tokenizer.apply_chat_template(messages, tokenize=True, return_tensors="pt").to("cuda")
generated_ids = model.generate(inputs, temperature=0.6, do_sample=True, min_length=5, max_length=2048)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=False, clean_up_tokenization_spaces=True)[0]
print(generated_text)
License
The model is distributed under the Apache 2.0 license, allowing for modification and distribution under the same terms.