QwenSlerp-14B
Introduction
QwenSlerp-14B is a merged language model for text generation. It is built with the SLERP (spherical linear interpolation) merge method, which combines two pre-trained models into a single set of weights.
Architecture
The model is built with the Hugging Face transformers library and merges the following base models:
- sthenno-com/miscii-14b-1225
- sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
The merged weights are stored in the bfloat16 data type, which halves memory use relative to float32 while keeping the same dynamic range.
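As a minimal sketch of what this means in practice (assuming a PyTorch backend), the checkpoint can be requested in bfloat16 explicitly so no conversion happens after loading:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the merged checkpoint in its native bfloat16 precision;
# passing the dtype avoids an upcast to float32 at load time.
model = AutoModelForCausalLM.from_pretrained(
    "hotmailuser/QwenSlerp-14B",
    torch_dtype=torch.bfloat16,
)
```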
Training
QwenSlerp-14B is not trained from scratch; it is produced by merging the two base models above with the SLERP method, so it inherits the strengths of both without any additional fine-tuning.
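For intuition, SLERP interpolates two weight tensors along the arc of a great circle rather than along a straight line, which tends to preserve weight magnitudes better than plain averaging. Below is a minimal, illustrative sketch of the core formula; real merges are normally run with a dedicated tool such as mergekit, and the interpolation factor `t` is a per-merge setting, not a value taken from this model's recipe:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns a, t=1 returns b; values in between follow the
    great-circle arc between the two (flattened) weight vectors.
    """
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the direction vectors of the two tensors
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel tensors: fall back to linear interpolation
        out = (1.0 - t) * a_flat + t * b_flat
    else:
        out = (torch.sin((1.0 - t) * omega) / so) * a_flat \
            + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)
```

Applying this tensor by tensor across both checkpoints yields the merged model.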
Guide: Running Locally
- Clone the Repository:
  ```bash
  # Note: the model weights are stored with Git LFS, so git-lfs must
  # be installed for the clone to fetch them.
  git clone https://huggingface.co/hotmailuser/QwenSlerp-14B
  cd QwenSlerp-14B
  ```
- Install Dependencies:
  ```bash
  # torch provides the backend transformers needs to load the weights
  pip install torch transformers safetensors
  ```
- Load the Model:
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model = AutoModelForCausalLM.from_pretrained("hotmailuser/QwenSlerp-14B")
  tokenizer = AutoTokenizer.from_pretrained("hotmailuser/QwenSlerp-14B")
  ```
- Run Inference:
  ```python
  input_text = "Your input text here"
  inputs = tokenizer(input_text, return_tensors="pt")
  # max_new_tokens caps the response length; without it, generate()
  # falls back to a very short default.
  outputs = model.generate(**inputs, max_new_tokens=100)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
Cloud GPUs: A 14B-parameter model in bfloat16 occupies roughly 28 GB for the weights alone, so for practical inference speeds consider cloud GPU services such as AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure.
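A minimal sketch of running on such a GPU, assuming a CUDA-capable device and combining the bfloat16 loading shown earlier with the inference steps from the guide:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "hotmailuser/QwenSlerp-14B",
    torch_dtype=torch.bfloat16,
).to(device)
tokenizer = AutoTokenizer.from_pretrained("hotmailuser/QwenSlerp-14B")

inputs = tokenizer("Your input text here", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```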
License
QwenSlerp-14B is distributed under the Apache 2.0 License, allowing for both personal and commercial use with proper attribution.