OpenHermes 2.5 Mistral 7B
by teknium
Introduction
OpenHermes 2.5 Mistral 7B is an advanced language model fine-tuned on the Mistral architecture, designed to enhance conversational and text generation capabilities. It builds upon previous versions by incorporating additional datasets, including code, which has significantly improved its performance on various benchmarks.
Architecture
OpenHermes 2.5 is based on the Mistral-7B architecture. It utilizes a fine-tuning process that incorporates code instructions and other high-quality datasets, resulting in enhanced performance on non-code benchmarks such as TruthfulQA and AGIEval, while also improving code-related metrics like the HumanEval score.
Training
The model was fine-tuned on 1,000,000 entries, primarily generated by GPT-4, along with data from other public datasets. Extensive filtering and format conversion were applied to align with the ShareGPT and ChatML standards. The training emphasized a balance between code-specific and generalist improvements, ensuring broad applicability across tasks.
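Since the training data was converted to the ChatML standard, prompts at inference time follow the same layout. The sketch below illustrates the general ChatML structure (special tokens delimiting each role turn); the helper function and message contents are illustrative, not part of the model card.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string.

    Each turn is wrapped in <|im_start|>role ... <|im_end|> markers; the
    final assistant turn is left open so the model completes it.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Example conversation (contents are hypothetical):
prompt = to_chatml([
    {"role": "system", "content": "You are Hermes, a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```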
Guide: Running Locally
- Setup Environment: Ensure you have Python and PyTorch installed.
- Install Transformers: Use pip install transformers to install the necessary libraries.
- Download Model: Access the model from Hugging Face's model hub: OpenHermes-2.5-Mistral-7B.
- Load Model: Utilize the Transformers library to load the model with:
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")
model = AutoModelForCausalLM.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")
- Run Inference: Use the tokenizer and model to generate responses.
- GPU Recommendation: For optimal performance, consider using a cloud GPU service such as AWS, Google Cloud, or Azure.
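The load-and-generate steps above can be tied together in one small helper. This is a hedged sketch: the sampling defaults are illustrative choices, not official recommendations, and actually running it requires downloading the ~15 GB of model weights (ideally onto a GPU), so the loading calls are shown in the usage comment rather than executed.

```python
MODEL_ID = "teknium/OpenHermes-2.5-Mistral-7B"

# Illustrative sampling defaults for chat-style generation (assumption,
# not values published by the model author).
GEN_KWARGS = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.95,
}

def generate_reply(model, tokenizer, prompt: str) -> str:
    """Tokenize a prompt, run generation, and decode only the new tokens."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **GEN_KWARGS)
    # Slice off the prompt tokens so only the model's reply is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Usage (requires `pip install transformers torch` and the model weights):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
# print(generate_reply(model, tokenizer, "Hello!"))
```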
License
OpenHermes 2.5 Mistral 7B is licensed under the Apache 2.0 License, permitting use, modification, and distribution.