hallucination_evaluation_model

vectara

Introduction

The Hallucination Evaluation Model (HHEM-2.1-Open), developed by Vectara, detects hallucinations in the output of large language models (LLMs). It performs significantly better than its predecessor, HHEM-1.0, as well as GPT-3.5-Turbo and GPT-4, making it particularly useful for retrieval-augmented generation (RAG) applications.

Architecture

HHEM-2.1-Open is built upon Google's FLAN-T5-Base model and is configured for text classification tasks. It features an unlimited context length, unlike HHEM-1.0, which was capped at 512 tokens. HHEM-2.1-Open is optimized to run on consumer-grade hardware, requiring less than 600MB of RAM at 32-bit precision and processing a 2k-token input in approximately 1.5 seconds on a modern x86 CPU.

Training

HHEM-2.1-Open is trained to assess pairs of texts (a premise and a hypothesis) for factual consistency, returning a score between 0 and 1, where 0 indicates the hypothesis has no support in the premise and 1 indicates full support. It has been benchmarked against datasets such as AggreFact and RAGTruth.

Guide: Running Locally

  1. Install Dependencies: The model is loaded through the Transformers library and runs on PyTorch.

    pip install transformers torch
    
  2. Load the Model: Use the AutoModelForSequenceClassification class.

    from transformers import AutoModelForSequenceClassification
    
    model = AutoModelForSequenceClassification.from_pretrained(
        'vectara/hallucination_evaluation_model', trust_remote_code=True)
    
  3. Prepare Data and Predict: Input pairs of premises and hypotheses to get scores.

    pairs = [
        ("The capital of France is Berlin.", "The capital of France is Paris."),
        ("I am in California", "I am in United States."),
    ]
    scores = model.predict(pairs)  # tensor of factual-consistency scores in [0, 1]
    
  4. Optional: Use a Pipeline: For convenience, use the pipeline class to automate input processing.

    from transformers import pipeline, AutoTokenizer
    
    classifier = pipeline(
        "text-classification",
        model='vectara/hallucination_evaluation_model',
        tokenizer=AutoTokenizer.from_pretrained('google/flan-t5-base'),
        trust_remote_code=True
    )
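The scores returned by model.predict in step 3 are floats in [0, 1], and the model itself does not prescribe a decision threshold. Below is a minimal post-processing sketch; the scores shown are illustrative stand-ins (not actual model output), and the 0.5 cutoff is an assumption you should tune for your application:

```python
def flag_unsupported(scores, threshold=0.5):
    """Return True for each score below the threshold, i.e. each
    hypothesis judged insufficiently supported by its premise.
    The 0.5 cutoff is an assumption, not a value fixed by HHEM."""
    return [score < threshold for score in scores]

# Stand-in scores for the two pairs in step 3 (not real model output).
example_scores = [0.01, 0.65]
print(flag_unsupported(example_scores))  # [True, False]
```

In practice, a lower threshold trades fewer false alarms for more missed hallucinations, so calibrate it on labeled examples from your own RAG pipeline.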
    

Cloud GPUs: For enhanced performance, especially with larger datasets, consider using cloud GPUs available from providers like AWS, Google Cloud, or Azure.

License

The HHEM-2.1-Open model is licensed under the Apache-2.0 License, allowing for wide use and modification in compliance with its terms.
