OpenHermes 2.5 Mistral 7B

teknium

Introduction

OpenHermes 2.5 Mistral 7B is an advanced language model fine-tuned from Mistral 7B, designed to strengthen conversational and text-generation capabilities. It builds on previous versions by incorporating additional datasets, including code, which significantly improved its performance on a range of benchmarks.

Architecture

OpenHermes 2.5 is based on the Mistral-7B architecture. Fine-tuning on code instructions alongside other high-quality datasets improved performance on non-code benchmarks such as TruthfulQA and AGIEval, while also raising code-related metrics such as the HumanEval score.

Training

The model was trained on 1,000,000 entries, primarily generated by GPT-4, along with other public datasets. Extensive filtering was applied, and the data was converted to the ShareGPT and ChatML formats. The training balanced code-specific and generalist improvements, giving the model broad applicability across tasks.
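For reference, a single ChatML-formatted exchange looks roughly like the example below; the message content is purely illustrative and not taken from the training data.

    <|im_start|>system
    You are a helpful assistant.<|im_end|>
    <|im_start|>user
    What is the capital of France?<|im_end|>
    <|im_start|>assistant
    The capital of France is Paris.<|im_end|>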

Guide: Running Locally

  1. Setup Environment: Ensure you have Python and PyTorch installed.
  2. Install Transformers: Run pip install transformers to install the required library.
  3. Download Model: Access the model on the Hugging Face model hub: teknium/OpenHermes-2.5-Mistral-7B.
  4. Load Model: Utilize the Transformers library to load the model with:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")
    model = AutoModelForCausalLM.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")
    
  5. Run Inference: Use the tokenizer and model to generate responses; a minimal example follows this list.
  6. GPU Recommendation: For optimal performance, consider using a cloud GPU service such as AWS, Google Cloud, or Azure.
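
As a minimal sketch of step 5, the snippet below builds a ChatML-style prompt by hand (matching the format noted in the Training section) and generates a single reply with the tokenizer and model loaded in step 4. The system message, the sampling settings (max_new_tokens, temperature), and the choice to sample rather than decode greedily are illustrative assumptions, not requirements of the model.

    # Assumes `tokenizer` and `model` from step 4 are already in scope.
    import torch

    # ChatML-style prompt; the system message and user question are illustrative.
    prompt = (
        "<|im_start|>system\n"
        "You are a helpful assistant.<|im_end|>\n"
        "<|im_start|>user\n"
        "Summarize what OpenHermes 2.5 is in one sentence.<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=128,   # illustrative generation length
            do_sample=True,
            temperature=0.7,      # illustrative sampling settings
        )

    # Decode only the newly generated tokens.
    reply = tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )
    print(reply)

On a CPU this runs slowly; a common shortcut is to load the model in half precision on a GPU, for example by passing torch_dtype=torch.float16 and device_map="auto" to from_pretrained (the latter requires the accelerate package), which ties in with the GPU recommendation in step 6.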

License

OpenHermes 2.5 Mistral 7B is licensed under the Apache 2.0 License, permitting use, modification, and distribution.
