QwenSlerp-14B
Introduction
QwenSlerp-14B is a merged language model for text generation. It is built with the SLERP (spherical linear interpolation) merge method, which combines two pre-trained models into a single set of weights.
Architecture
The model is built with the Hugging Face transformers library and merges the following base models:
- sthenno-com/miscii-14b-1225
- sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
The merged weights are stored in the bfloat16 data type, which halves memory use relative to float32 while keeping the same dynamic range.
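As a minimal sketch of what this means in practice (assuming a PyTorch backend), the checkpoint can be requested in bfloat16 explicitly so no conversion happens after loading:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the merged checkpoint in its native bfloat16 precision;
# passing the dtype avoids an upcast to float32 at load time.
model = AutoModelForCausalLM.from_pretrained(
    "hotmailuser/QwenSlerp-14B",
    torch_dtype=torch.bfloat16,
)
```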
Training
QwenSlerp-14B is not trained from scratch; it is produced by merging the two base models above with the SLERP method, so it inherits the strengths of both without any additional fine-tuning.
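For intuition, SLERP interpolates two weight tensors along the arc of a great circle rather than along a straight line, which tends to preserve weight magnitudes better than plain averaging. Below is a minimal, illustrative sketch of the core formula; real merges are normally run with a dedicated tool such as mergekit, and the interpolation factor `t` is a per-merge setting, not a value taken from this model's recipe:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns a, t=1 returns b; values in between follow the
    great-circle arc between the two (flattened) weight vectors.
    """
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the direction vectors of the two tensors
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel tensors: fall back to linear interpolation
        out = (1.0 - t) * a_flat + t * b_flat
    else:
        out = (torch.sin((1.0 - t) * omega) / so) * a_flat \
            + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)
```

Applying this tensor by tensor across both checkpoints yields the merged model.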
Guide: Running Locally
- Clone the Repository:
  ```bash
  # Note: the model weights are stored with Git LFS, so git-lfs must
  # be installed for the clone to fetch them.
  git clone https://huggingface.co/hotmailuser/QwenSlerp-14B
  cd QwenSlerp-14B
  ```
- Install Dependencies:
  ```bash
  # torch provides the backend transformers needs to load the weights
  pip install torch transformers safetensors
  ```
- Load the Model:
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model = AutoModelForCausalLM.from_pretrained("hotmailuser/QwenSlerp-14B")
  tokenizer = AutoTokenizer.from_pretrained("hotmailuser/QwenSlerp-14B")
  ```
- Run Inference:
  ```python
  input_text = "Your input text here"
  inputs = tokenizer(input_text, return_tensors="pt")
  # max_new_tokens caps the response length; without it, generate()
  # falls back to a very short default.
  outputs = model.generate(**inputs, max_new_tokens=100)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
Cloud GPUs: A 14B-parameter model in bfloat16 occupies roughly 28 GB for the weights alone, so for practical inference speeds consider cloud GPU services such as AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure.
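A minimal sketch of running on such a GPU, assuming a CUDA-capable device and combining the bfloat16 loading shown earlier with the inference steps from the guide:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "hotmailuser/QwenSlerp-14B",
    torch_dtype=torch.bfloat16,
).to(device)
tokenizer = AutoTokenizer.from_pretrained("hotmailuser/QwenSlerp-14B")

inputs = tokenizer("Your input text here", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```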
License
QwenSlerp-14B is distributed under the Apache 2.0 License, allowing for both personal and commercial use with proper attribution.