Teuken-7B-instruct-commercial-v0.4-GGUF

Introduction

Teuken-7B-instruct-commercial-v0.4-GGUF is a quantized version of the instruction-tuned multilingual large language model (LLM) Teuken-7B. It supports 24 European languages and was developed within the OpenGPT-X project with contributions from several institutions.

Architecture

The model is a transformer-based, decoder-only architecture with the following specifications:

Parameters: 7B
Sequence Length: 4096
Layers: 32
Hidden Size: 4096
Feedforward Network Size: 13440
Attention Heads: 32
Position Embeddings: Rotary
Normalization: RMSNorm
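
Two quantities that follow from these figures can be worked out with a little arithmetic. The sketch below assumes the usual convention that the hidden size is split evenly across the attention heads, and it treats "7B" as exactly 7e9 parameters stored at 2 bytes each (bf16). The result is a rough estimate of weight memory only, not a measured footprint, and it does not describe the quantized GGUF files.

```python
# Figures taken from the specification list above.
HIDDEN_SIZE = 4096
ATTENTION_HEADS = 32

# Assumption: "7B" is treated as exactly 7e9 parameters.
NOMINAL_PARAMS = 7_000_000_000

# Assumption: the hidden size is divided evenly across the heads.
head_dim = HIDDEN_SIZE // ATTENTION_HEADS


def weight_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


print("head dimension:", head_dim)  # 128
print("bf16 weights (GiB):", round(weight_gib(NOMINAL_PARAMS, 2), 1))  # 13.0
```

Activations and the key/value cache come on top of this estimate, so the memory needed in practice is higher.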

Training

The base model, Teuken-7B-base-v0.4, was pre-trained on 4 trillion tokens from publicly available sources up to September 2023. It was then instruction-tuned on English and German datasets, along with translations into 22 other European languages. Training used bf16 mixed precision on the JUWELS Booster infrastructure with NVIDIA A100 GPUs.

Guide: Running Locally

To run the model locally, follow these steps:

Install Required Libraries:
- transformers
- sentencepiece
- torch
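
Before loading the model, it can help to confirm that these libraries are importable in the active environment. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the check assumes each package's import name matches the name in the list above.

```python
import importlib.util

# Libraries named in the list above.
REQUIRED = ["transformers", "sentencepiece", "torch"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can be installed with the environment's package manager, for example pip.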

Load Model and Tokenizer:

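import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the GPU when CUDA is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_name = "openGPT-X/Teuken-7B-instruct-commercial-v0.4"
# trust_remote_code=True allows the custom code in the model repository to run.
# Load the weights in bf16, move them to the device, and switch to inference mode.
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype=torch.bfloat16).to(device).eval()
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, trust_remote_code=True)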

Generate Text:

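# A single user turn; chat_template="DE" selects the template named "DE".
messages = [{"role": "User", "content": "Wer bist du?"}]
prompt_ids = tokenizer.apply_chat_template(messages, chat_template="DE", tokenize=True, add_generation_prompt=True, return_tensors="pt")
# Sample one completion; max_length counts prompt tokens plus generated tokens.
prediction = model.generate(prompt_ids.to(model.device), max_length=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.7, num_return_sequences=1)
# Decode the full sequence (prompt and completion) back to text.
print(tokenizer.decode(prediction[0].tolist()))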
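
The decoded output above contains the prompt as well as the model's reply, because a decoder-only model returns the prompt tokens followed by the newly generated ones. A minimal sketch of separating the two is shown below. The helper name `strip_prompt` is hypothetical, and it works on plain Python lists of token IDs so that it can be tried without loading the model.

```python
def strip_prompt(prompt_token_ids, output_token_ids):
    """Return only the tokens that were generated after the prompt."""
    prompt_len = len(prompt_token_ids)
    # The generated sequence is expected to begin with the prompt tokens.
    if output_token_ids[:prompt_len] != prompt_token_ids:
        raise ValueError("output does not start with the prompt")
    return output_token_ids[prompt_len:]


# Toy token IDs stand in for real tokenizer output.
print(strip_prompt([1, 2, 3], [1, 2, 3, 40, 41]))  # prints [40, 41]
```

With the generation snippet above, and assuming `prompt_ids` is a tensor of shape (1, n), the reply alone could be obtained with `strip_prompt(prompt_ids[0].tolist(), prediction[0].tolist())` and then passed to `tokenizer.decode(..., skip_special_tokens=True)`.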

For best performance, run the model on a GPU such as an NVIDIA A100. The code above falls back to the CPU when CUDA is unavailable, which is considerably slower.

License

The Teuken-7B-instruct-commercial-v0.4-GGUF model is released under the Apache 2.0 license, which allows for both commercial and non-commercial use.
