Teuken-7B-instruct-commercial-v0.4-GGUF
QuantFactory
Introduction
Teuken-7B-instruct-commercial-v0.4-GGUF is a GGUF-quantized version of Teuken-7B-instruct-commercial-v0.4, an instruction-tuned multilingual large language model (LLM). The model supports 24 European languages and was developed in the OpenGPT-X project with contributions from several partner institutions.
Architecture
The model is a transformer-based, decoder-only architecture with the following specifications:
- Parameters: 7B
- Sequence Length: 4096
- Layers: 32
- Hidden Size: 4096
- Feedforward Network Size: 13440
- Attention Heads: 32
- Position Embeddings: Rotary
- Normalization: RMSNorm
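As a rough sanity check, the figures above can be turned into a back-of-the-envelope parameter count. The sketch below is an estimate under stated assumptions, not the model's actual layout: it assumes standard multi-head attention (four hidden-by-hidden projections per layer) and a gated feedforward block with three weight matrices, neither of which is stated in this card, and it ignores embeddings, normalization weights, and biases. The function name `estimate_block_params` is hypothetical.

```python
def estimate_block_params(hidden: int, ffn: int, layers: int, ffn_matrices: int = 3) -> int:
    """Rough weight count for the transformer blocks only.

    Assumptions (not taken from the model card):
    - attention uses four hidden x hidden projections (Q, K, V, output)
    - the feedforward block is gated, i.e. three hidden x ffn matrices
    Embeddings, norms, and biases are ignored.
    """
    attention = 4 * hidden * hidden
    feedforward = ffn_matrices * hidden * ffn
    return layers * (attention + feedforward)


# Values copied from the specification list above.
HIDDEN, FFN, LAYERS, HEADS = 4096, 13440, 32, 32

head_dim = HIDDEN // HEADS          # per-head dimension
total = estimate_block_params(HIDDEN, FFN, LAYERS)

print(head_dim)                     # 128
print(f"{total / 1e9:.2f}B")        # 7.43B
```

Under these assumptions the blocks alone come to roughly 7.4 billion weights, which is in line with the "7B" label; a different attention or feedforward layout would shift the number.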
Training
The base model, Teuken-7B-base-v0.4, was pre-trained on 4 trillion tokens from publicly available sources with a data cutoff of September 2023. It was then instruction-tuned on English and German datasets, together with translations of that data into 22 other European languages. Training used bf16 mixed precision on the JUWELS Booster infrastructure with NVIDIA A100 GPUs.
Guide: Running Locally
To run the original, unquantized model locally with the transformers library, follow these steps:
- Install Required Libraries:
  - transformers
  - sentencepiece
  - torch
- Load Model and Tokenizer:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use a GPU when one is available, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model_name = "openGPT-X/Teuken-7B-instruct-commercial-v0.4"

    # The repository ships custom code, so trust_remote_code=True is required.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    ).to(device).eval()
    tokenizer = AutoTokenizer.from_pretrained(
        model_name,
        use_fast=False,
        trust_remote_code=True,
    )

- Generate Text:

    messages = [{"role": "User", "content": "Wer bist du?"}]

    # Build the prompt with the German ("DE") chat template.
    prompt_ids = tokenizer.apply_chat_template(
        messages,
        chat_template="DE",
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    )

    # Sample one completion; max_length counts prompt and reply tokens together.
    prediction = model.generate(
        prompt_ids.to(model.device),
        max_length=512,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.7,
        num_return_sequences=1,
    )
    print(tokenizer.decode(prediction[0].tolist()))
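Two details of the generation step are easy to miss: in transformers, `max_length` covers the prompt and the reply together, and the sequence returned by `generate` for a decoder-only model starts with the prompt tokens. The sketch below illustrates both with plain Python lists so that it runs without downloading the model. The helper names `new_token_budget` and `reply_ids` and the toy token IDs are hypothetical and not part of the original guide.

```python
def new_token_budget(prompt_len: int, max_length: int = 512) -> int:
    """Tokens left for the reply when max_length counts prompt + reply."""
    return max(max_length - prompt_len, 0)


def reply_ids(sequence: list[int], prompt_len: int) -> list[int]:
    """Drop the echoed prompt tokens from a generated sequence."""
    return sequence[prompt_len:]


# Toy token IDs standing in for real tokenizer output (hypothetical values).
prompt = [1, 17, 42, 7]
generated = prompt + [99, 100, 2]

print(new_token_budget(len(prompt)))        # 508
print(reply_ids(generated, len(prompt)))    # [99, 100, 2]
```

With the real objects from the guide, and assuming `prompt_ids` is a tensor of shape (1, prompt length), the same slicing would apply to `prediction[0]` before decoding, so that only the model's answer is printed.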
For best performance, run the model on a GPU such as an NVIDIA A100, for example through a cloud provider.
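To get a feel for the hardware involved, the memory needed just to hold the weights can be estimated from the parameter count and the number of bits stored per weight. The sketch below is an idealized calculation that assumes exactly 7 billion parameters and ignores the KV cache, activations, and runtime overhead; real quantized files also store scaling metadata, so they tend to be somewhat larger than these figures. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Idealized weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 7e9  # nominal "7B"; the exact count is not given in this card

for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
# bf16: 14.0 GB
# 8-bit: 7.0 GB
# 4-bit: 3.5 GB
```

This is the main reason a quantized build is attractive for local use: the bf16 weights call for a GPU with well over 14 GB of memory, while lower-bit variants fit on much smaller devices.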
License
The Teuken-7B-instruct-commercial-v0.4-GGUF model is released under the Apache 2.0 license, which allows for both commercial and non-commercial use.