Ahma-7B-Instruct

Finnish-NLP

Introduction

Ahma-7B-Instruct is a conversational model fine-tuned for instruction-following in Finnish. It is based on the Ahma-7B model, a decoder-only transformer using the Llama architecture, pre-trained from scratch on Finnish data.

Architecture

The model uses a decoder-only transformer architecture with a context length of 2048 tokens, 32 layers, a hidden dimension of 4096, and 32 attention heads. The base Ahma-7B model has 7 billion parameters; the instruct-tuned version is optimized for chat-style interactions.
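
As a rough illustration, the dimensions above map onto a Llama-style configuration like the sketch below. The vocabulary and feed-forward sizes shown are placeholders taken from Llama defaults, not the model's published values, which come from the released model files.

from transformers import LlamaConfig

# Sketch of the stated dimensions as a Llama-style configuration.
# vocab_size and intermediate_size are illustrative placeholders.
config = LlamaConfig(
    hidden_size=4096,              # model dimension
    num_hidden_layers=32,          # transformer layers
    num_attention_heads=32,        # attention heads
    max_position_embeddings=2048,  # context length
    vocab_size=32000,              # placeholder (Llama default)
    intermediate_size=11008,       # placeholder (Llama default)
)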

Training

The model was first fine-tuned with supervised fine-tuning (SFT) on a diverse dataset that included translated and synthetic data, then further refined with Direct Preference Optimization (DPO) on instruction-following preference data. The tokenizer uses Byte Pair Encoding (BPE), and fine-tuning used Rank-Stabilized LoRA (rsLoRA) for parameter-efficient adaptation.
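
As a concrete illustration of that second stage, the sketch below wires up DPO with Rank-Stabilized LoRA using recent versions of the peft and trl libraries. This is a generic example of the technique, not the authors' training code: the rank, alpha, beta, and the toy preference dataset are all placeholders, and model and tokenizer are assumed to be the SFT checkpoint loaded as in the guide below.

from datasets import Dataset
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# Toy preference dataset: DPO expects prompt/chosen/rejected columns.
preference_dataset = Dataset.from_dict({
    "prompt": ["Kerro lyhyesti Suomesta."],
    "chosen": ["Suomi on Pohjois-Euroopan tasavalta..."],
    "rejected": ["En tiedä."],
})

# Rank-Stabilized LoRA: use_rslora=True scales adapters by lora_alpha / sqrt(r)
# instead of lora_alpha / r, which keeps updates stable at higher ranks.
peft_config = LoraConfig(
    r=64,                  # placeholder rank
    lora_alpha=16,         # placeholder scaling factor
    use_rslora=True,       # Rank-Stabilized LoRA
    task_type="CAUSAL_LM",
)

trainer = DPOTrainer(
    model=model,                                      # SFT checkpoint (assumed loaded)
    args=DPOConfig(output_dir="ahma-dpo", beta=0.1),  # beta is a placeholder
    train_dataset=preference_dataset,
    processing_class=tokenizer,                       # assumed loaded tokenizer
    peft_config=peft_config,
)
trainer.train()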

Guide: Running Locally

  1. Dependencies: Install the transformers and torch libraries (pip install transformers torch).
  2. Model and Tokenizer: Load them with AutoTokenizer and AutoModelForCausalLM from the transformers library.
  3. Prompt Setup: Format prompts with the tokenizer's built-in chat template.
  4. Execution: Run on a CUDA-capable GPU; the 7B weights alone take roughly 14 GB of memory in bfloat16. Cloud GPU services such as AWS, Google Cloud, or Azure are an option if no local GPU is available.

The example below puts these steps together:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Finnish-NLP/Ahma-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "Finnish-NLP/Ahma-7B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)

# A chat needs a user turn; the user message here is only an example prompt.
messages = [
    {"role": "system", "content": "Olet tekoälyavustaja..."},
    {"role": "user", "content": "Kerro lyhyesti Suomen historiasta."},
]

# add_generation_prompt=True appends the assistant header so the model answers
# instead of continuing the user turn; model.device follows device_map="auto".
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(inputs, temperature=0.6, do_sample=True, min_length=5, max_length=2048)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=False, clean_up_tokenization_spaces=True)[0]
print(generated_text)
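
To stream tokens to the console as they are produced rather than waiting for the full sequence, transformers' TextStreamer can be attached to the same generate call. This reuses model, tokenizer, and inputs from the example above.

from transformers import TextStreamer

# Prints decoded tokens to stdout as they arrive; skip_prompt drops the echoed
# input, and skip_special_tokens removes chat-template markers from the output.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(inputs, streamer=streamer, temperature=0.6, do_sample=True, max_length=2048)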

License

The model is distributed under the Apache 2.0 license, which permits use, modification, and redistribution, including commercial use, provided the license text and notices are preserved.
