Captain-Eris-Diogenes_Twilight-V0.420-12B

Nitral-AI

Introduction

Captain-Eris-Diogenes_Twilight-V0.420-12B is a merged model hosted on Hugging Face and intended for text generation. It combines two base models into a single checkpoint, drawing on the strengths of each within the text generation domain.

Architecture

The model is constructed using a combination of two base models:

  • Nitral-AI/Captain-Eris_Twilight-V0.420-12B
  • Nitral-AI/Diogenes-12B-ChatMLified

The architecture is produced with a "slerp" (spherical linear interpolation) merge across matching layer ranges of both models. The merge configuration applies separate interpolation-factor schedules to the self-attention (self_attn) and feed-forward (mlp) parameters, and the merged weights are stored in the bfloat16 data type.
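Slerp interpolates along the arc between two points rather than the straight line, which for model merging means blending two weight tensors while preserving their magnitude better than plain averaging. Below is a minimal sketch of the primitive only, not the actual merge implementation (tools such as mergekit apply this per-tensor with per-layer factors):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t is the interpolation factor: 0 returns v0, 1 returns v1.
    """
    # Normalized copies are used only to measure the angle between tensors.
    v0_n = v0.ravel() / (np.linalg.norm(v0) + eps)
    v1_n = v1.ravel() / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    omega = np.arccos(dot)
    so = np.sin(omega)
    if abs(so) < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    return (np.sin((1.0 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1
```

For unit-norm inputs the result stays on the unit sphere, which is the property that distinguishes slerp from a straight weighted average.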

Training

No additional training is performed: the model is produced by merging slices covering layers 0 through 40 of each base model. Separate interpolation factors are specified for the attention mechanisms and the multi-layer perceptrons (MLPs), controlling how weight is distributed between the two parents across the merged layers.
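The per-filter parameters can be pictured as choosing a different interpolation factor depending on which sub-module a tensor belongs to. A hypothetical sketch of that selection logic (the factor values below are illustrative only, not the model's actual merge configuration):

```python
def interpolation_factor(param_name, t_attn=0.4, t_mlp=0.6, t_default=0.5):
    """Pick a slerp factor by parameter type, mimicking mergekit-style
    'filters'. All factor values here are hypothetical placeholders."""
    if "self_attn" in param_name:
        return t_attn          # attention tensors lean toward one parent
    if "mlp" in param_name:
        return t_mlp           # feed-forward tensors lean toward the other
    return t_default           # everything else is blended evenly
```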

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Required Libraries: Ensure the transformers library and a backend such as PyTorch are installed.

    pip install transformers torch
    
  2. Download the Model: The model files are fetched automatically from the Hugging Face repository on first load; alternatively, clone the repository manually (this requires Git LFS for the weight files).

  3. Load the Model: Use the transformers library to load the model.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "Nitral-AI/Captain-Eris-Diogenes_Twilight-V0.420-12B"
    # Load in bfloat16, the merge's native dtype, to halve memory use.
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
  4. Generate Text: Use the model to perform text generation tasks.

    input_text = "Once upon a time"
    inputs = tokenizer(input_text, return_tensors="pt")
    # Cap the output length; generate() otherwise stops very early by default.
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
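One of the base models is described as ChatML-ified, so the merge likely expects ChatML-formatted prompts for conversational use. A hypothetical helper showing what that format looks like (in practice, tokenizer.apply_chat_template builds this for you when the tokenizer ships a chat template):

```python
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt.

    This assumes the merged model inherits ChatML formatting from the
    Diogenes-12B-ChatMLified base; verify against the tokenizer's own
    chat template before relying on it.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # Open the assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)
```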
    

For optimal performance, consider using cloud GPUs such as those on AWS EC2, Google Cloud, or Azure: a 12B-parameter model in bfloat16 requires roughly 24 GB of memory for the weights alone, so a GPU with at least that much VRAM is recommended for inference.
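As a rough sizing guide when choosing hardware, the weights' memory footprint can be estimated from the parameter count and dtype (this back-of-the-envelope figure excludes activations and the KV cache, which add more on top):

```python
def approx_vram_gib(n_params_billion, bytes_per_param=2):
    """Approximate weight memory in GiB.

    bfloat16 uses 2 bytes per parameter; use 4 for float32,
    or 1 for an 8-bit quantization.
    """
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# A 12B model in bfloat16 needs roughly 22-23 GiB for the weights.
print(f"{approx_vram_gib(12):.1f} GiB")
```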

License

The model is available for use under the terms specified by its creators on the Hugging Face platform. Users should review the repository for specific licensing information and adhere to any usage guidelines provided.
