Qwen Stock1 14 B
hotmailuserIntroduction
The QWENSTOCK1-14B model is a text generation model developed using a merge of multiple pre-trained language models. This model utilizes the mergekit
tool and is compatible with the Transformers library.
Architecture
The architecture of QWENSTOCK1-14B involves merging several distinct models to create a comprehensive language model. The merging process uses the Model Stock merge method, which is detailed in a corresponding arXiv paper (arxiv: 2403.19522). The base model for the merge is sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
.
Training
The QWENSTOCK1-14B model is not trained from scratch but is a product of merging pre-trained models. The models involved in the merge include:
sthenno-com/miscii-14b-1225
djuna/Q2.5-Veltha-14B-0.5
CultriX/Qwen2.5-14B-Wernicke
CultriX/Qwen2.5-14B-MergeStock
allknowingroger/Qwenslerp6-14B
allknowingroger/Qwenslerp5-14B
hotmailuser/QwenSlerp-14B
The merging process was configured using YAML, with normalization enabled and the data type set to bfloat16
.
Guide: Running Locally
To run the QWENSTOCK1-14B model locally, follow these basic steps:
- Environment Setup: Ensure you have Python and pip installed. Set up a virtual environment for better dependency management.
- Install Dependencies: Use pip to install the Transformers library and other required packages:
pip install transformers safetensors
- Download Model: Use the Hugging Face
transformers
library to load the model:from transformers import AutoModelForCausalLM, AutoTokenizer model_name = "hotmailuser/QwenStock1-14B" model = AutoModelForCausalLM.from_pretrained(model_name) tokenizer = AutoTokenizer.from_pretrained(model_name)
- Inference: Run a text generation inference using the model:
input_text = "Your prompt here" inputs = tokenizer(input_text, return_tensors="pt") outputs = model.generate(**inputs) print(tokenizer.decode(outputs[0], skip_special_tokens=True))
For optimal performance, it is recommended to use cloud GPUs available through services such as AWS, Google Cloud, or Azure.
License
The QWENSTOCK1-14B model is released under the Apache 2.0 License, allowing users to freely use, modify, and distribute the model with proper attribution.