Eurus-RM-7B

openbmb

Introduction

Eurus-RM-7B is a reward model trained on a mixture of the UltraInteract, UltraFeedback, and UltraSafety datasets with a reward modeling objective designed to strengthen reasoning. Among reward models at the 7-billion-parameter scale it delivers the best overall performance, matching or surpassing much larger models and even outperforming GPT-4 on certain tasks. The training objective is particularly effective on hard reasoning problems, the dataset mixture keeps its reward modeling abilities balanced, and using its scores to rerank candidate responses yields substantial gains in reasoning performance.

Architecture

Eurus-RM-7B is a 7B-parameter reward model: given a prompt and a candidate response, it outputs a single scalar reward score. Its design centers on reward modeling for reasoning, pairing a mixture of preference datasets with a dedicated training objective so that it handles complex reasoning tasks while remaining a strong general-purpose reward model.

Training

Eurus-RM-7B is trained on a blend of the UltraInteract, UltraFeedback, and UltraSafety datasets. The training process centers on a reward modeling objective that improves reasoning, especially in challenging scenarios, and this combination lets the model outperform other reward models, including much larger ones and GPT-4 on certain tasks.
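
To make the kind of objective involved concrete, the sketch below implements a standard Bradley-Terry-style pairwise reward-modeling loss in PyTorch. It is a generic illustration of the technique, not the exact objective used to train Eurus-RM-7B, and the reward_model, chosen_inputs, and rejected_inputs names are hypothetical.

    import torch
    import torch.nn.functional as F

    def pairwise_reward_loss(chosen_rewards, rejected_rewards):
        """Bradley-Terry-style pairwise loss: encourage the reward of the
        chosen response to exceed the reward of the rejected response."""
        # -log sigmoid(r_chosen - r_rejected), averaged over the batch
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

    # Hypothetical usage with a reward model that outputs one scalar per sequence:
    # chosen_rewards = reward_model(**chosen_inputs)      # shape: (batch,)
    # rejected_rewards = reward_model(**rejected_inputs)  # shape: (batch,)
    # loss = pairwise_reward_loss(chosen_rewards, rejected_rewards)
    # loss.backward()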

Guide: Running Locally

  1. Set up the environment: Ensure Python and PyTorch are installed, then install the transformers library from Hugging Face.

    pip install transformers torch
    
  2. Download the model: Use the following code to load the tokenizer and model. The trust_remote_code=True flag lets Transformers run the custom modeling code shipped in the model repository, which implements the reward model.

    from transformers import AutoTokenizer, AutoModel
    
    tokenizer = AutoTokenizer.from_pretrained("openbmb/Eurus-RM-7b")
    model = AutoModel.from_pretrained("openbmb/Eurus-RM-7b", trust_remote_code=True)
    
  3. Inference: Use the model to score responses. The example below compares the reward assigned to a chosen answer with the reward assigned to a rejected one; a positive difference means the model prefers the chosen answer.

    from transformers import AutoTokenizer, AutoModel
    import torch

    def test(model_path):
        # One chosen/rejected pair, formatted with the [INST] ... [/INST] template
        dataset = [
            {
                "chosen": "[INST] Sural relates to which part of the body? [/INST] The sural region...",
                "rejected": "[INST] Sural relates to which part of the body? [/INST] The Sural nerve..."
            }
        ]

        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

        with torch.no_grad():
            for example in dataset:
                # The model returns one scalar reward per input sequence
                inputs = tokenizer(example["chosen"], return_tensors="pt")
                chosen_reward = model(**inputs).item()
                inputs = tokenizer(example["rejected"], return_tensors="pt")
                rejected_reward = model(**inputs).item()
                # A positive difference means the chosen answer got the higher reward
                print(chosen_reward - rejected_reward)

    test("openbmb/Eurus-RM-7b")
    
  4. Cloud GPUs: For enhanced performance, consider using cloud GPU services like AWS, Google Cloud, or Azure.
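
  5. Reranking (optional): The introduction notes that Eurus-RM-7B boosts reasoning performance through reranking. The sketch below shows one way best-of-n reranking could be wired up with the model's reward scores; the prompt, the candidate responses, and the rerank helper are illustrative placeholders rather than part of the official examples.

    from transformers import AutoTokenizer, AutoModel
    import torch

    def rerank(model_path, prompt, candidates):
        """Score each candidate with the reward model and return the
        candidates sorted from highest to lowest reward."""
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)

        scores = []
        with torch.no_grad():
            for response in candidates:
                # Same [INST] ... [/INST] template as in the inference example
                text = f"[INST] {prompt} [/INST] {response}"
                inputs = tokenizer(text, return_tensors="pt")
                scores.append(model(**inputs).item())

        # Highest-reward candidate first (best-of-n selection)
        return sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)

    # Hypothetical candidates, e.g. sampled from a separate generator model
    prompt = "What is 17 * 24?"
    candidates = ["17 * 24 = 408", "17 * 24 = 398"]
    for response, score in rerank("openbmb/Eurus-RM-7b", prompt, candidates):
        print(f"{score:.3f}  {response}")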

License

Eurus-RM-7B is released under the Apache-2.0 License, a permissive license that allows use, modification, and distribution, provided the license text and copyright notices are retained.
