MISCII-14B-1225

sthenno-com

Introduction

MISCII-14B-1225 is a 14-billion-parameter text-generation model for general language tasks in both English and Chinese. It is built on the Hugging Face transformers library and was produced with the mergekit model-merging tool. The model is licensed under Apache 2.0.

Architecture

The model results from merging several pre-trained language models with the TIES method, using sthenno-com/miscii-14b-1028 as the base model. The merge configuration specifies a bfloat16 data type, weight normalization, and per-model weight and density parameters. The primary contributing models are sthenno/exp-002 and sthenno/miscii-1218; the general shape of such a configuration is sketched below.
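
In mergekit, a merge recipe is declared in a YAML file. The following is a minimal sketch of what a TIES configuration for this merge could look like; the weight and density values are placeholders, not the values actually used for this model:

    # Hypothetical mergekit TIES configuration (placeholder values).
    merge_method: ties
    base_model: sthenno-com/miscii-14b-1028
    models:
      - model: sthenno/exp-002
        parameters:
          weight: 0.5    # placeholder
          density: 0.5   # placeholder
      - model: sthenno/miscii-1218
        parameters:
          weight: 0.5    # placeholder
          density: 0.5   # placeholder
    parameters:
      normalize: true
    dtype: bfloat16

A file like this is typically applied with mergekit's mergekit-yaml command, e.g. mergekit-yaml config.yml ./merged-model.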

Training

Details on the specific training process for MISCII-14B-1225 are forthcoming. The model was, however, created by merging existing pre-trained models rather than by traditional training from scratch, so it combines their capabilities to improve performance; the TIES procedure it relies on is illustrated below.
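
For intuition, TIES (TrIm, Elect Sign, and Merge) works in three steps: trim each model's parameter deltas against the base down to the highest-magnitude fraction given by the density, elect a per-parameter sign by majority vote, and average only the trimmed deltas that agree with the elected sign. The sketch below illustrates the idea on plain tensors; it is a simplification for exposition, not the mergekit code used to build this model:

    import torch

    def ties_merge(base, finetuned, density=0.5):
        """Illustrative TIES merge of fine-tuned tensors relative to a base."""
        deltas = []
        for ft in finetuned:
            delta = ft - base
            # Trim: zero out all but the top-`density` fraction by magnitude.
            k = max(1, int(density * delta.numel()))
            cutoff = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
            deltas.append(torch.where(delta.abs() >= cutoff, delta, torch.zeros_like(delta)))
        stacked = torch.stack(deltas)
        # Elect: majority sign for each parameter across models.
        sign = torch.sign(stacked.sum(dim=0))
        # Merge: average only the nonzero deltas whose sign matches the vote.
        agree = (torch.sign(stacked) == sign) & (stacked != 0)
        merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
        return base + merged

Applied tensor-by-tensor across the contributing checkpoints, this yields the merged model's weights.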

Guide: Running Locally

  1. Installation: Install the Hugging Face Transformers library, along with PyTorch and accelerate (used below for automatic device placement):

    pip install transformers torch accelerate
    
  2. Model Download: Download the MISCII-14B-1225 model from the Hugging Face Model Hub. At 14B parameters the bfloat16 weights alone occupy roughly 28 GB, so keep the checkpoint's native precision and let accelerate place it on the available devices:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("sthenno-com/miscii-14b-1225")
    model = AutoModelForCausalLM.from_pretrained(
        "sthenno-com/miscii-14b-1225",
        torch_dtype="auto",   # keep the checkpoint's bfloat16 weights
        device_map="auto",    # requires accelerate
    )
    
  3. Inference: Use the model for text generation (a chat-style variant is sketched after this list):

    inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50)  # forwards the attention mask as well
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  4. Hardware Recommendation: A 14B-parameter model needs a GPU with roughly 28 GB of VRAM for its bfloat16 weights alone, plus headroom for activations and the KV cache. If local hardware falls short, consider cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure.
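
Merged chat models in this family usually ship a chat template. If the miscii-14b-1225 tokenizer provides one (an assumption; the model card does not state it), prompts are better wrapped in the template than passed as raw strings:

    # Chat-style prompting; assumes the tokenizer defines a chat template.
    messages = [{"role": "user", "content": "Hello, how are you?"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))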

License

The MISCII-14B-1225 model is released under the Apache 2.0 license, allowing for both personal and commercial use with attribution.
