Nous Hermes Llama2 13b
Introduction
Nous-Hermes-Llama2-13b is an advanced language model fine-tuned on over 300,000 instructions. Developed by Nous Research with contributions from Teknium and Emozilla, and compute sponsored by Redmond AI, this model is designed to offer long, coherent responses with a reduced hallucination rate and no OpenAI censorship mechanisms.
Architecture
The model retains consistency with its predecessor, Hermes on Llama-1, by using the same datasets. It was fine-tuned with a 4096 sequence length using an 8x A100 80GB DGX machine, ensuring enhanced capabilities while maintaining familiar behavior.
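As a rough illustration of why hardware of this class is needed, here is a back-of-envelope memory estimate for a 13B-parameter model at the 4096-token sequence length mentioned above. The layer count (40) and hidden size (5120) are the published Llama-2-13B architecture figures, not values stated in this card, and the estimate ignores activations and optimizer state, so treat it as a lower bound.

```python
# Back-of-envelope GPU memory estimate for a 13B Llama-2-style model.
# Architecture constants are the published Llama-2-13B shape (an
# assumption relative to this card); real usage is higher.
PARAMS = 13_000_000_000   # ~13B parameters
BYTES_FP16 = 2            # bytes per weight in fp16/bf16
LAYERS = 40               # Llama-2-13B decoder layers
HIDDEN = 5120             # Llama-2-13B hidden size
SEQ_LEN = 4096            # fine-tuning sequence length from this card

weights_gb = PARAMS * BYTES_FP16 / 1e9
# KV cache: two tensors (K and V) per layer, HIDDEN values per token.
kv_cache_gb = 2 * LAYERS * HIDDEN * BYTES_FP16 * SEQ_LEN / 1e9

print(f"weights ~{weights_gb:.0f} GB, KV cache at {SEQ_LEN} tokens ~{kv_cache_gb:.1f} GB")
```

The ~26 GB of fp16 weights alone explains the 80 GB A100s: a single consumer GPU cannot hold the model at full precision.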
Training
The training data consisted predominantly of synthetic GPT-4 outputs, improving the model's knowledge, task completion, and style. The datasets were curated from diverse sources, including GPTeacher, roleplay datasets, and others. Contributors such as Teknium, Karan4D, and Microsoft played a significant role in dataset creation and the fine-tuning process.
Guide: Running Locally
- Setup Environment: Ensure you have Python and PyTorch installed, then install the Hugging Face Transformers library:

  ```shell
  pip install transformers
  ```

- Download Model: Load the model and tokenizer from the Hugging Face repository:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model = AutoModelForCausalLM.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")
  tokenizer = AutoTokenizer.from_pretrained("NousResearch/Nous-Hermes-Llama2-13b")
  ```

- Run Inference: Use the model to generate text:

  ```python
  inputs = tokenizer("Your input prompt here", return_tensors="pt")
  outputs = model.generate(**inputs)
  print(tokenizer.decode(outputs[0]))
  ```

- Consider Cloud GPUs: Given the model's size, cloud GPUs from providers such as AWS, Google Cloud, or Azure can substantially improve performance and efficiency.
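Nous-Hermes models were fine-tuned on Alpaca-style instruction data, so prompts generally work best wrapped in an "### Instruction:" / "### Response:" template. A minimal helper for building such prompts is sketched below; the function name and exact whitespace are illustrative assumptions rather than a format mandated by this card.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in an Alpaca-style template.

    The exact spacing here is an assumption; adjust it to match the
    format documented in the model repository if results look off.
    """
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

# Build a prompt and inspect it; pass the result to the tokenizer
# in the inference step above instead of a raw string.
prompt = build_alpaca_prompt("Summarize the plot of Hamlet in two sentences.")
print(prompt)
```

Using this template instead of a bare string typically yields more coherent, instruction-following completions from instruction-tuned models.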
License
The model is provided under the MIT License, allowing for wide usage and modification with minimal restrictions.