UTENA-7B-NSFW-V2

AI-B

Introduction

UTENA-7B-NSFW-V2 is a text-generation model created by merging pre-trained language models with the MergeKit tool. It is intended for general text-generation tasks, has been evaluated on several benchmark datasets, and is listed on the Open LLM Leaderboard.

Architecture

The model is a SLERP (spherical linear interpolation) merge of two parent models, AI-B/UTENA-7B-NSFW and AI-B/UTENA-7B-BAGEL. The merge configuration takes layer slices from each parent and applies specific parameters and filters to them. The merged weights use bfloat16 precision.
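To make the SLERP idea concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is not MergeKit's implementation: the function name `slerp`, the interpolation factor `t`, and the toy vectors are assumptions chosen for illustration, and a real merge works on full tensors with the parameters set in its configuration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1. Hypothetical helper for
    illustration only, operating on flat lists of floats.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Zero-length input has no direction: fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    # Nearly parallel (or anti-parallel) vectors: linear interpolation is stable.
    if sin_theta < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors (toy data, not real weights).
merged = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(merged)
```

Unlike a plain weighted average, the interpolated vector in this toy case keeps unit length, which is the usual motivation for choosing SLERP over linear blending.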

Training

The model was produced by merging rather than by a separate training run. It was evaluated on several benchmarks with different few-shot settings:

  • AI2 Reasoning Challenge (25-shot): normalized accuracy of 63.31.
  • HellaSwag (10-shot): normalized accuracy of 84.54.
  • MMLU (5-shot): accuracy of 63.97.
  • TruthfulQA (0-shot): mc2 score of 47.81.
  • Winogrande (5-shot): accuracy of 78.69.
  • GSM8k (5-shot): accuracy of 42.38.
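The six scores above can be summarized as a single figure. The sketch below assumes an unweighted mean over the six benchmarks; the source does not state an average, so treat the result as derived arithmetic rather than a reported number.

```python
# Scores copied from the list above.
scores = {
    "ARC (25-shot)": 63.31,
    "HellaSwag (10-shot)": 84.54,
    "MMLU (5-shot)": 63.97,
    "TruthfulQA (0-shot)": 47.81,
    "Winogrande (5-shot)": 78.69,
    "GSM8k (5-shot)": 42.38,
}

# Unweighted mean across benchmarks (an assumption about how to aggregate).
average = sum(scores.values()) / len(scores)
print(f"Average score: {average:.2f}")  # prints "Average score: 63.45"
```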

Guide: Running Locally

  1. Clone the Repository:

    # Git LFS is needed to download the weight files
    git lfs install
    git clone https://huggingface.co/AI-B/UTENA-7B-NSFW-V2
    cd UTENA-7B-NSFW-V2
    
  2. Install Dependencies:
    Ensure you have Python installed, then install the transformers library together with PyTorch, which it needs to load the model:

    pip install transformers torch
    
  3. Load the Model:
    Use the following Python code to load and interact with the model:

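    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "AI-B/UTENA-7B-NSFW-V2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Load the weights in bfloat16, the precision used for the merge
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    input_text = "Your input text here"
    inputs = tokenizer(input_text, return_tensors="pt")
    # Set an explicit limit on generated tokens; the default is short
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))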
    
  4. Consider Cloud GPUs:
    For efficient computation, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.
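When sizing a GPU instance, a rough estimate of the memory needed for the weights alone can help. The sketch below assumes about 7 billion parameters, inferred only from the "7B" in the model name, and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Assumption: parameter count taken from the "7B" in the model name.
ASSUMED_PARAMS = 7_000_000_000

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

def weight_memory_gib(num_params, dtype):
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the bfloat16 weights come to roughly 13 GiB, so a GPU with 16 GB or more is a reasonable starting point.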

License

The model is released under the Unlicense, allowing for free use, modification, and distribution of the model and associated components.
