NightWing3-10B-v0.1

Nitral-AI

Introduction

NightWing3-10B-v0.1 is a text generation model developed by Nitral-AI. It is distributed through Hugging Face and loads with the Transformers library. Rather than coming from a single training run, the model is built by merging two earlier training iterations of NightWing3.

Architecture

NightWing3-10B is based on the Falcon3-10B architecture. It is assembled with the SLERP merge method, which combines model slices taken from two sources within the Nitral-Archive repository. The YAML merge configuration specifies which layers and parameters are drawn from the two training iterations of the NightWing3 model.

Training

The final model is produced by a SLERP (Spherical Linear Interpolation) merge rather than by an additional training run. The merge involves:

  • Combining layers from two model sources: nightwing3-r64-1-latest_test-train-10B and nightwing3-r64-2-latest_test-train-10B.
  • Applying separate parameter filters to the self-attention and multi-layer perceptron (MLP) weights.

The merge uses the bfloat16 data type, which stores each weight in 16 bits and so halves memory use relative to float32.
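
To make the interpolation step concrete, here is a minimal sketch of SLERP on two flat weight vectors, using only the Python standard library. It is an illustration of the general technique, not the merge tool's actual implementation: the function name `slerp`, the `eps` threshold, and the linear fallback for degenerate inputs are assumptions made for this example, and a real merge operates on full tensors, one per layer.

```python
import math
from typing import List, Sequence


def slerp(t: float, v0: Sequence[float], v1: Sequence[float],
          eps: float = 1e-8) -> List[float]:
    """Interpolate between two equal-length weight vectors along the arc joining them.

    t = 0 returns v0, t = 1 returns v1. Illustrative sketch only.
    """
    if len(v0) != len(v1):
        raise ValueError("vectors must have the same length")

    # Plain linear interpolation, used as a fallback for degenerate cases.
    linear = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction, so there is no angle to interpolate.
        return linear

    # Angle between the two vectors, clamped to guard against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel or anti-parallel vectors: the SLERP weights are unstable.
        return linear

    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]
```

For two orthogonal unit vectors, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` returns a point that still has unit length (approximately `[0.707, 0.707]`), whereas plain averaging would give `[0.5, 0.5]`. Preserving the scale of the weights in this way is the usual motivation for preferring SLERP over linear interpolation when merging.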

Guide: Running Locally

To run NightWing3-10B-v0.1 locally:

  1. Install Dependencies: Ensure you have Python, the transformers library, and a backend such as PyTorch installed.

    pip install transformers torch
    
  2. Download the Model: Fetch the model files from the Hugging Face model repository. Calling from_pretrained in the next step downloads and caches them automatically on first use.

  3. Load the Model: Use the Transformers library to load the tokenizer and model in your script.

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("Nitral-AI/NightWing3-10B-v0.1")
    model = AutoModelForCausalLM.from_pretrained("Nitral-AI/NightWing3-10B-v0.1")
    
  4. Run Inference: Tokenize a prompt, pass it to the model's generate method, and decode the returned token ids into text.
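
The steps above can be sketched end to end as follows. This is a minimal example under stated assumptions, not an official usage recipe: the helper names `build_generation_kwargs` and `generate_text`, the default sampling values, and the choice to load in bfloat16 are all illustrative, and the model card may recommend a specific prompt format that is not shown here.

```python
from typing import Any, Dict

# Repository id taken from the loading step above.
MODEL_ID = "Nitral-AI/NightWing3-10B-v0.1"


def build_generation_kwargs(max_new_tokens: int = 128,
                            temperature: float = 0.7) -> Dict[str, Any]:
    """Collect settings for generate(); the defaults are illustrative assumptions."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    if temperature > 0:
        # Sample from the distribution, scaled by the temperature.
        return {"max_new_tokens": max_new_tokens,
                "do_sample": True,
                "temperature": temperature}
    # A temperature of 0 is treated here as a request for greedy decoding.
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def generate_text(prompt: str, **settings: Any) -> str:
    """Load the model, generate a continuation of `prompt`, and return it as text."""
    # Imported inside the function so the helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: load in bfloat16 to roughly halve memory use versus float32.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.to("cuda" if torch.cuda.is_available() else "cpu")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_kwargs(**settings))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_text("Write a short story about a lighthouse.", max_new_tokens=64))
```

Because `generate_text` reloads the model on every call, a longer-running script would normally load the tokenizer and model once and reuse them; the single-function form is kept here only for brevity.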

A model with roughly 10 billion parameters needs about 20 GB of memory for its weights alone in bfloat16, which exceeds most consumer GPUs. If local hardware is insufficient, consider a cloud GPU service such as AWS, GCP, or Azure.
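
The hardware concern can be checked with simple arithmetic: the memory needed for the weights is the parameter count multiplied by the bytes per parameter. The sketch below assumes a nominal 10 billion parameters (the exact count may differ) and counts only the weights, not activations or the key-value cache; the function name `weight_memory_gb` is hypothetical.

```python
# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Estimate memory for model weights in gigabytes (1 GB = 1e9 bytes)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unsupported dtype: {dtype}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    nominal_params = 10e9  # assumption: "10B" taken at face value
    for dtype in ("float32", "bfloat16", "int8"):
        print(f"{dtype}: about {weight_memory_gb(nominal_params, dtype):.0f} GB")
```

Under these assumptions the weights take about 40 GB in float32, 20 GB in bfloat16, and 10 GB in int8, so actual requirements during inference will be somewhat higher than these figures.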

License

This model is released under an "other" license. For specific terms and conditions, refer to the Hugging Face model card or contact Nitral-AI directly.
