NightWing3-10B-v0.1
Nitral-AI

Introduction
NightWing3-10B-v0.1 is a text-generation model developed by Nitral-AI. It is distributed through the Hugging Face Hub and is loaded with the Transformers library. Rather than being trained from scratch, it is produced by merging two fine-tuned checkpoints, as described below.
Architecture
NightWing3-10B is based on the Falcon3-10B architecture. It uses the SLERP merge method to combine layer slices from two source models hosted in the Nitral-Archive repository. The YAML merge configuration specifies which layers and parameters are taken from each of the two NightWing3 training iterations.
Training
The model is produced using a SLERP (Spherical Linear Interpolation) merge, which involves:
- Combining layers from two model sources: nightwing3-r64-1-latest_test-train-10B and nightwing3-r64-2-latest_test-train-10B.
- Applying specific parameter filters for the self-attention and multi-layer perceptron (MLP) weights.

The merge is performed in the bfloat16 data type for computational efficiency.
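As a rough illustration of the technique, here is a minimal sketch of spherical linear interpolation between two weight tensors. The `slerp` helper, the stand-in tensors, and the interpolation factor are all hypothetical and are not taken from the actual merge configuration.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors with factor t in [0, 1]."""
    # Measure the angle between the two tensors, treating each as a flat vector.
    u0 = v0.flatten() / (v0.norm() + eps)
    u1 = v1.flatten() / (v1.norm() + eps)
    omega = torch.arccos(torch.clamp(torch.dot(u0, u1), -1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    sin_omega = torch.sin(omega)
    # Standard SLERP formula, applied elementwise to the original shapes.
    return (torch.sin((1.0 - t) * omega) / sin_omega) * v0 + (torch.sin(t * omega) / sin_omega) * v1

# Blend a stand-in layer 40% toward the second source model.
w0 = torch.randn(64, 64)  # e.g. a self-attention weight from source 1
w1 = torch.randn(64, 64)  # the same weight from source 2
merged = slerp(0.4, w0, w1)
```

In an actual merge, a factor like this is applied per layer, with separate values for the self-attention and MLP parameter filters mentioned above.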
Guide: Running Locally
To run NightWing3-10B-v0.1 locally:
- Install Dependencies: Ensure you have Python and the transformers library installed.

  ```bash
  pip install transformers
  ```
- Download the Model: Fetch the model files from the Hugging Face model repository.
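  If you prefer to fetch the files explicitly rather than relying on the automatic download in the next step, here is a minimal sketch using the huggingface_hub library (an assumption: it is installed separately, e.g. with pip install huggingface_hub).

  ```python
  from huggingface_hub import snapshot_download

  # Downloads every file in the repository into the local Hugging Face
  # cache and returns the path to the cached snapshot.
  local_path = snapshot_download(repo_id="Nitral-AI/NightWing3-10B-v0.1")
  print(local_path)
  ```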
- Load the Model: Use the Transformers library to load the model in your script.

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("Nitral-AI/NightWing3-10B-v0.1")
  model = AutoModelForCausalLM.from_pretrained("Nitral-AI/NightWing3-10B-v0.1")
  ```
- Run Inference: Use the loaded model for text generation, as in the sketch below.
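  As an illustration, here is a minimal generation sketch; the prompt and sampling parameters are arbitrary examples, not settings recommended by Nitral-AI.

  ```python
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "Nitral-AI/NightWing3-10B-v0.1"
  device = "cuda" if torch.cuda.is_available() else "cpu"

  tokenizer = AutoTokenizer.from_pretrained(model_id)
  # bfloat16 matches the dtype used for the merge and halves memory versus float32.
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to(device)

  prompt = "Write a short story about a night flight."
  inputs = tokenizer(prompt, return_tensors="pt").to(device)

  # Sample up to 256 new tokens.
  outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```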
Consider using cloud GPU services such as AWS, GCP, or Azure for best performance: at bfloat16 precision, the weights of a 10B-parameter model alone occupy roughly 20 GB, which may exceed what local hardware can provide.
License
This model is released under an "other" license. For specific terms and conditions, refer to the Hugging Face model card or contact Nitral-AI directly.