Qwen2.5-14B-Vimarckoso-v3
by sometimesanotion

Introduction
Qwen2.5-14B-Vimarckoso-v3 is a sophisticated text generation model focused on enhancing reasoning capabilities while maintaining strong instruction-following abilities. It is part of the broader Lamarck project and has been developed using various advanced merging techniques to integrate strengths from multiple models.
Architecture
The model is built upon several base models, including Virtuoso-Small, Rombos-LLM-V2.6-Qwen-14b, and others, leveraging their unique capabilities. The architecture combines the `model_stock` and `slerp` merging methods to refine performance. During merging, weights are converted from `float32` to `bfloat16`, and the `int8_mask`, `normalize`, and `rescale` parameters are applied to optimize the result.
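To make the `slerp` method concrete, the sketch below shows spherical linear interpolation between two weight tensors, which is the per-tensor operation a slerp-based merge performs. This is a minimal illustration of the general technique, not the project's actual merge code; the interpolation factor and the parallel-vector fallback threshold are assumptions.

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative sketch)."""
    v0 = w0.flatten().float()
    v1 = w1.flatten().float()
    # Angle between the two weight vectors, computed on unit-normalized copies.
    cos_theta = torch.clamp(torch.dot(v0 / (v0.norm() + eps), v1 / (v1.norm() + eps)), -1.0, 1.0)
    theta = torch.arccos(cos_theta)
    if theta.abs() < 1e-4:
        # Nearly parallel tensors: plain linear interpolation is numerically safer.
        blended = (1 - t) * v0 + t * v1
    else:
        # Interpolate along the great circle between the two weight vectors.
        blended = (torch.sin((1 - t) * theta) * v0 + torch.sin(t * theta) * v1) / torch.sin(theta)
    return blended.reshape(w0.shape).to(w0.dtype)

# Example: blend a tensor 40% of the way from model A toward model B.
merged = slerp(0.4, torch.randn(4, 4), torch.randn(4, 4))
```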
Training
The training process involved merging techniques that incorporate various models, including EVA-Qwen2.5 and Qwen2.5-Lumen-14B, to enhance reasoning and instruction-following. Specific configurations and parameter settings, such as the use of LoRAs from Abliterate-Qwenvergence, were applied to ensure the model's robustness and accuracy.
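Applying a LoRA on top of a base checkpoint, as described above, can be sketched with the `peft` library. The adapter id below is a placeholder, since the exact location of the Abliterate-Qwenvergence LoRAs is not given here; this illustrates the general mechanism rather than the model's actual build process.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load a base checkpoint, attach a LoRA adapter, then fold it into the weights.
base = AutoModelForCausalLM.from_pretrained(
    "sometimesanotion/Qwen2.5-14B-Vimarckoso-v3", torch_dtype=torch.bfloat16
)
# "your-org/your-lora-adapter" is a placeholder; substitute a real adapter repo.
model = PeftModel.from_pretrained(base, "your-org/your-lora-adapter")
model = model.merge_and_unload()  # bake the LoRA deltas into the base weights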
Guide: Running Locally
- Clone the Repository: Start by cloning the model repository from Hugging Face.
- Install Required Libraries: Ensure you have the `transformers` library and other dependencies installed:

  ```bash
  pip install transformers
  ```
- Load the Model: Use the `transformers` library to load the tokenizer and model:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("sometimesanotion/Qwen2.5-14B-Vimarckoso-v3")
  model = AutoModelForCausalLM.from_pretrained("sometimesanotion/Qwen2.5-14B-Vimarckoso-v3")
  ```
- Run Inference: Use the model to generate text from a given prompt; an end-to-end sketch follows this list.

  ```python
  inputs = tokenizer("Your prompt here", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=256)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
- Consider Cloud GPUs: Due to the model's size, utilizing cloud GPUs such as AWS EC2, Google Cloud, or Azure can significantly improve performance and reduce computation time.
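Putting the guide together, here is a minimal end-to-end sketch that loads the model in `bfloat16`, spreads it across available GPUs, and formats the prompt with the tokenizer's chat template, which Qwen2.5-based instruct models generally expect. The `device_map="auto"` option requires the `accelerate` package, and the sampling settings are illustrative assumptions rather than recommended defaults.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sometimesanotion/Qwen2.5-14B-Vimarckoso-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory relative to float32
    device_map="auto",           # place layers on available GPUs (requires `accelerate`)
)

# Format the prompt with the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Explain spherical interpolation in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```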
License
This model is released under the Apache 2.0 license, allowing for both personal and commercial use with proper attribution.