Captain-Eris_Violet-V0.420-12B
Nitral-AI

Introduction
Captain-Eris_Violet-V0.420-12B is a text generation model developed by Nitral-AI. It is built for use with the transformers library and targets English-language text. The model is a merge of two existing models, combined to strengthen its text generation capabilities.
Architecture
The model is the result of merging two base models, Epiculous/Violet_Twilight-v0.2 and Nitral-AI/Captain_BMO-12B, using SLERP (spherical linear interpolation) applied across specified layer ranges from each base model. The configuration employs the bfloat16 data type, which halves the memory footprint relative to float32 while retaining its dynamic range.
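For readers unfamiliar with SLERP, the sketch below illustrates the basic operation on a pair of weight tensors: the two tensors are interpolated along the great circle between them, with t controlling how far the result sits from the first model toward the second. This is a simplified, illustrative helper, not the code used by the merge tooling; the function name and normalization details are assumptions for demonstration only.

```python
import torch

def slerp(w_a: torch.Tensor, w_b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative sketch).

    t=0 returns w_a, t=1 returns w_b; intermediate t blends along the great circle.
    """
    a = w_a.flatten().float()
    b = w_b.flatten().float()
    # Angle between the two weight vectors, computed on their unit directions.
    a_n = a / (a.norm() + eps)
    b_n = b / (b.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_n, b_n), -1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * w_a + t * w_b
    so = torch.sin(omega)
    mixed = (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return mixed.reshape(w_a.shape).to(w_a.dtype)
```

In practice the interpolation factor is not a single global constant: it is varied per layer and per component (self-attention versus MLP), which is exactly what the configuration described in the next section controls.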
Training
The merge was defined by a YAML configuration that specifies the layer ranges taken from each base model and the merging parameters, including layer slicing and separate interpolation weights (filters) for the self-attention and MLP (multi-layer perceptron) components. A sketch of such a configuration is shown below.
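The following is a rough illustration of the kind of mergekit-style YAML the section describes: a SLERP merge with layer slices and per-component interpolation filters. The layer ranges, base-model choice, and interpolation values are placeholders, not the settings actually used for this model; consult the configuration in the model repository for the real values.

```yaml
# Illustrative SLERP merge configuration (placeholder values only).
slices:
  - sources:
      - model: Nitral-AI/Captain_BMO-12B
        layer_range: [0, 40]            # placeholder layer range
      - model: Epiculous/Violet_Twilight-v0.2
        layer_range: [0, 40]            # placeholder layer range
merge_method: slerp
base_model: Nitral-AI/Captain_BMO-12B   # placeholder choice of base
parameters:
  t:
    - filter: self_attn                 # interpolation weights for attention blocks
      value: [0, 0.5, 0.3, 0.7, 1]      # placeholder values
    - filter: mlp                       # interpolation weights for MLP blocks
      value: [1, 0.5, 0.7, 0.3, 0]      # placeholder values
    - value: 0.5                        # default for all other tensors
dtype: bfloat16
```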
Guide: Running Locally
To run Captain-Eris_Violet-V0.420-12B locally, follow these steps:
- Installation: Ensure you have Python and the necessary libraries installed, including transformers.
- Clone the Model: Download the model from the Hugging Face repository (cloning the full weights requires Git LFS), or let the transformers loader in the next step download and cache them automatically; a programmatic download option is also sketched after these steps.

  ```bash
  git clone https://huggingface.co/Nitral-AI/Captain-Eris_Violet-V0.420-12B
  ```
- Load the Model: Use the transformers library to load the model for inference.

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Download (if not already cached) and load the merged weights and tokenizer.
  model = AutoModelForCausalLM.from_pretrained("Nitral-AI/Captain-Eris_Violet-V0.420-12B")
  tokenizer = AutoTokenizer.from_pretrained("Nitral-AI/Captain-Eris_Violet-V0.420-12B")
  ```
- Inference: Generate text using the loaded model.

  ```python
  # Tokenize a prompt, generate a continuation, and decode it back to text.
  input_text = "Once upon a time,"
  inputs = tokenizer(input_text, return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=128)
  decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
  print(decoded_output)
  ```
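As an optional alternative to the git clone step above, the model files can be downloaded programmatically with the huggingface_hub client. This is an extra dependency not listed in the original guide and is only needed if you want an explicit local copy rather than relying on the loader's cache.

```python
# Optional alternative to `git clone`: fetch the repository with huggingface_hub.
from huggingface_hub import snapshot_download

# Downloads every file in the repo into the local Hugging Face cache and returns its path.
local_path = snapshot_download(repo_id="Nitral-AI/Captain-Eris_Violet-V0.420-12B")
print(local_path)
```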
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure: at roughly two bytes per parameter in bfloat16, the 12B weights alone occupy about 24 GB of memory, so a GPU with ample VRAM (or multiple GPUs) is needed for comfortable inference. A GPU-oriented loading sketch follows.
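The sketch below shows one common way to load the model in bfloat16 and let the weights be placed across available GPU memory. The device_map="auto" placement requires the accelerate package, which is an assumption beyond the libraries listed above, and the sampling settings are illustrative defaults rather than values recommended by the model author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nitral-AI/Captain-Eris_Violet-V0.420-12B"

# Load in bfloat16 (matching the merge dtype) and let accelerate place layers
# on the available GPU(s) automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Once upon a time,", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,      # illustrative sampling settings, not author-recommended values
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```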
License
The model is released under a license listed simply as "other" on Hugging Face; the terms are not summarized in the model card. Users should review the specific terms and conditions in the license file provided with the model repository.