QwenStock1-14B

hotmailuser

Introduction

QwenStock1-14B is a text-generation model created by merging multiple pre-trained language models with the mergekit tool. It is compatible with the Transformers library.

Architecture

QwenStock1-14B combines several fine-tuned Qwen2.5-14B variants into a single model using the Model Stock merge method (arXiv:2403.19522). The base model for the merge is sometimesanotion/Qwen2.5-14B-Vimarckoso-v3.
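As a rough illustration of the Model Stock idea, the sketch below averages the fine-tuned weights and then interpolates toward the pre-trained base with a ratio t derived from the angle between fine-tuned checkpoints. This is a simplified pure-Python toy on flat weight lists, not mergekit's implementation; the t-formula follows the paper, but mergekit applies it per layer on full tensors.

```python
import math

def model_stock_merge(base, finetuned):
    """Toy Model Stock merge over flat lists of weights (illustration only)."""
    k = len(finetuned)
    # Uniform average of the fine-tuned weights.
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    # cos(theta) between the first two fine-tuned deltas relative to the base.
    d0 = [w - b for w, b in zip(finetuned[0], base)]
    d1 = [w - b for w, b in zip(finetuned[1], base)]
    dot = sum(x * y for x, y in zip(d0, d1))
    cos_t = dot / (math.hypot(*d0) * math.hypot(*d1))
    # Interpolation ratio from the Model Stock paper.
    t = k * cos_t / (1 + (k - 1) * cos_t)
    # Move from the base toward the average by ratio t.
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]

base = [0.0, 1.0, -0.5]
tuned = [[0.1, 1.1, -0.4], [0.2, 0.9, -0.45], [0.05, 1.05, -0.6]]
merged = model_stock_merge(base, tuned)
print(len(merged))  # 3
```

The intuition is that fine-tuned checkpoints cluster around the pre-trained anchor; the angle between them determines how far the merged model should move from the anchor toward their average.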

Training

QwenStock1-14B is not trained from scratch; it is the product of merging the following pre-trained models:

  • sthenno-com/miscii-14b-1225
  • djuna/Q2.5-Veltha-14B-0.5
  • CultriX/Qwen2.5-14B-Wernicke
  • CultriX/Qwen2.5-14B-MergeStock
  • allknowingroger/Qwenslerp6-14B
  • allknowingroger/Qwenslerp5-14B
  • hotmailuser/QwenSlerp-14B

The merging process was configured using YAML, with normalization enabled and the data type set to bfloat16.
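A mergekit configuration consistent with the description above might look like the following. This is a reconstruction for illustration: the model list, base model, normalization, and dtype come from this card, but the exact published file may differ.

```yaml
models:
  - model: sthenno-com/miscii-14b-1225
  - model: djuna/Q2.5-Veltha-14B-0.5
  - model: CultriX/Qwen2.5-14B-Wernicke
  - model: CultriX/Qwen2.5-14B-MergeStock
  - model: allknowingroger/Qwenslerp6-14B
  - model: allknowingroger/Qwenslerp5-14B
  - model: hotmailuser/QwenSlerp-14B
merge_method: model_stock
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
normalize: true
dtype: bfloat16
```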

Guide: Running Locally

To run QwenStock1-14B locally, follow these basic steps:

  1. Environment Setup: Ensure you have Python and pip installed. Set up a virtual environment for better dependency management.
  2. Install Dependencies: Use pip to install the Transformers library and other required packages:
    pip install torch transformers safetensors
    
  3. Download Model: Use the Hugging Face transformers library to load the model:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "hotmailuser/QwenStock1-14B"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
  4. Inference: Run a text generation inference using the model:
    input_text = "Your prompt here"
    inputs = tokenizer(input_text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    

Since a 14B-parameter model requires substantial GPU memory, it is recommended to use cloud GPUs available through services such as AWS, Google Cloud, or Azure.
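To see why a large GPU is needed, a back-of-the-envelope estimate of the weight memory alone (the 14B parameter count is approximate, and activations plus the KV cache need additional room on top of this):

```python
# Rough memory estimate for loading a 14B-parameter model in bfloat16.
params = 14e9
bytes_per_param = 2  # bfloat16 is 2 bytes per parameter
gib = params * bytes_per_param / 2**30
print(f"~{gib:.0f} GiB just for the weights")  # ~26 GiB
```

This is why a single 24 GB consumer GPU is typically insufficient for full-precision inference, while quantized variants or multi-GPU setups can fit.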

License

The QwenStock1-14B model is released under the Apache 2.0 License, allowing users to freely use, modify, and distribute the model with proper attribution.
