QwenStock1-14B

hotmailuser

Introduction

QwenStock1-14B is a text-generation model created by merging multiple pre-trained language models with the mergekit tool. It is compatible with the Transformers library.

Architecture

QwenStock1-14B combines several fine-tuned Qwen2.5-14B variants into a single model using the Model Stock merge method (arXiv:2403.19522). The base model for the merge is sometimesanotion/Qwen2.5-14B-Vimarckoso-v3.
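As a rough illustration of the Model Stock idea, the sketch below averages the fine-tuned weights and then interpolates toward the pre-trained base with a ratio t derived from the angle between fine-tuned checkpoints. This is a simplified pure-Python toy on flat weight lists, not mergekit's implementation; the t-formula follows the paper, but mergekit applies it per layer on full tensors.

```python
import math

def model_stock_merge(base, finetuned):
    """Toy Model Stock merge over flat lists of weights (illustration only)."""
    k = len(finetuned)
    # Uniform average of the fine-tuned weights.
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    # cos(theta) between the first two fine-tuned deltas relative to the base.
    d0 = [w - b for w, b in zip(finetuned[0], base)]
    d1 = [w - b for w, b in zip(finetuned[1], base)]
    dot = sum(x * y for x, y in zip(d0, d1))
    cos_t = dot / (math.hypot(*d0) * math.hypot(*d1))
    # Interpolation ratio from the Model Stock paper.
    t = k * cos_t / (1 + (k - 1) * cos_t)
    # Move from the base toward the average by ratio t.
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]

base = [0.0, 1.0, -0.5]
tuned = [[0.1, 1.1, -0.4], [0.2, 0.9, -0.45], [0.05, 1.05, -0.6]]
merged = model_stock_merge(base, tuned)
print(len(merged))  # 3
```

The intuition is that fine-tuned checkpoints cluster around the pre-trained anchor; the angle between them determines how far the merged model should move from the anchor toward their average.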

Training

QwenStock1-14B is not trained from scratch; it is the product of merging the following pre-trained models:

  • sthenno-com/miscii-14b-1225
  • djuna/Q2.5-Veltha-14B-0.5
  • CultriX/Qwen2.5-14B-Wernicke
  • CultriX/Qwen2.5-14B-MergeStock
  • allknowingroger/Qwenslerp6-14B
  • allknowingroger/Qwenslerp5-14B
  • hotmailuser/QwenSlerp-14B

The merging process was configured using YAML, with normalization enabled and the data type set to bfloat16.
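A mergekit configuration consistent with the description above might look like the following. This is a reconstruction for illustration: the model list, base model, normalization, and dtype come from this card, but the exact published file may differ.

```yaml
models:
  - model: sthenno-com/miscii-14b-1225
  - model: djuna/Q2.5-Veltha-14B-0.5
  - model: CultriX/Qwen2.5-14B-Wernicke
  - model: CultriX/Qwen2.5-14B-MergeStock
  - model: allknowingroger/Qwenslerp6-14B
  - model: allknowingroger/Qwenslerp5-14B
  - model: hotmailuser/QwenSlerp-14B
merge_method: model_stock
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
normalize: true
dtype: bfloat16
```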

Guide: Running Locally

To run QwenStock1-14B locally, follow these basic steps:

  1. Environment Setup: Ensure you have Python and pip installed. Set up a virtual environment for better dependency management.
  2. Install Dependencies: Use pip to install the Transformers library and other required packages:
    pip install torch transformers safetensors
    
  3. Download Model: Use the Hugging Face transformers library to load the model:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "hotmailuser/QwenStock1-14B"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
  4. Inference: Run a text generation inference using the model:
    input_text = "Your prompt here"
    inputs = tokenizer(input_text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    

Since a 14B-parameter model requires substantial GPU memory, it is recommended to use cloud GPUs available through services such as AWS, Google Cloud, or Azure.
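To see why a large GPU is needed, a back-of-the-envelope estimate of the weight memory alone (the 14B parameter count is approximate, and activations plus the KV cache need additional room on top of this):

```python
# Rough memory estimate for loading a 14B-parameter model in bfloat16.
params = 14e9
bytes_per_param = 2  # bfloat16 is 2 bytes per parameter
gib = params * bytes_per_param / 2**30
print(f"~{gib:.0f} GiB just for the weights")  # ~26 GiB
```

This is why a single 24 GB consumer GPU is typically insufficient for full-precision inference, while quantized variants or multi-GPU setups can fit.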

License

The QwenStock1-14B model is released under the Apache 2.0 License, allowing users to freely use, modify, and distribute the model with proper attribution.
