DeBERTa-v3-large-mnli-fever-anli-ling-wanli

MoritzLaurer

Introduction

DeBERTa-v3-large-mnli-fever-anli-ling-wanli is a natural language inference (NLI) model that can also be used for zero-shot classification. Fine-tuned on several NLI datasets, it performs particularly well on adversarial benchmarks such as ANLI.

Architecture

The model is built on Microsoft's DeBERTa-v3-large. DeBERTa-v3 improves on classical masked language models such as BERT and RoBERTa by pairing DeBERTa's disentangled attention with an ELECTRA-style replaced-token-detection pre-training objective, which strengthens its text understanding and classification performance.
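
As a quick check of the backbone, the checkpoint's configuration can be inspected. A minimal sketch (note that DeBERTa-v3 checkpoints are served by the deberta-v2 model classes in Transformers):

    from transformers import AutoConfig

    # Inspect the backbone of the fine-tuned checkpoint
    config = AutoConfig.from_pretrained(
        "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
    )
    print(config.model_type)         # deberta-v2 (v3 checkpoints reuse these classes)
    print(config.num_hidden_layers)  # 24 layers in the large variant
    print(config.hidden_size)        # hidden size 1024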

Training

Training Data

The model was trained on the MultiNLI, Fever-NLI, ANLI, LingNLI, and WANLI datasets, totaling 885,242 hypothesis-premise pairs. The SNLI dataset was excluded due to quality issues.
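
For reference, some of this data can be pulled straight from the Hugging Face Hub. A minimal sketch, assuming the datasets library is installed; MultiNLI and ANLI have canonical Hub releases, while the Fever-NLI, LingNLI, and WANLI versions used for training are derived or community-hosted, so their Hub IDs may differ:

    from datasets import load_dataset

    # MultiNLI and ANLI have canonical Hub releases; the NLI-reformatted
    # FEVER, LingNLI, and WANLI data are derived/community datasets.
    mnli = load_dataset("multi_nli", split="train")
    anli = load_dataset("anli", split="train_r1")
    print(mnli[0]["premise"], "|", mnli[0]["hypothesis"], "|", mnli[0]["label"])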

Training Procedure

The model was trained with the Hugging Face Trainer using the following hyperparameters (see the TrainingArguments sketch after the list):

  • Epochs: 4
  • Learning Rate: 5e-06
  • Batch Size: 16 (train), 64 (eval)
  • Gradient Accumulation Steps: 2
  • Warmup Ratio: 0.06
  • Weight Decay: 0.01
  • Mixed Precision Training: Enabled (fp16)
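
Expressed as a TrainingArguments configuration, these settings look roughly as follows; the output directory is a hypothetical placeholder, and the dataset preparation and Trainer wiring are omitted:

    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="./deberta-v3-large-nli",  # hypothetical path, not from the model card
        num_train_epochs=4,
        learning_rate=5e-06,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=64,
        gradient_accumulation_steps=2,
        warmup_ratio=0.06,
        weight_decay=0.01,
        fp16=True,  # mixed-precision training
    )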

Guide: Running Locally

Basic Steps

  1. Install Dependencies (the pipeline needs a backend such as PyTorch, and the DeBERTa-v3 tokenizer uses SentencePiece):

    pip install transformers torch sentencepiece
    
  2. Load the Model:

    from transformers import pipeline
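    # Build a zero-shot classification pipeline on top of the NLI model;
    # the checkpoint is downloaded from the Hugging Face Hub on first use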
    classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli")
    
  3. Classify a Sequence:

    sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
    candidate_labels = ["politics", "economy", "entertainment", "environment"]
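    # multi_label=False treats the candidate labels as mutually exclusive,
    # so the returned scores are normalized to sum to 1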
    output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
    print(output)
    
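Because the pipeline wraps an NLI classifier, the model can also score a premise-hypothesis pair directly. A minimal sketch (the hypothesis string is illustrative, and the label names are read from the model config rather than hard-coded):

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    premise = "Angela Merkel is a politician in Germany and leader of the CDU"
    hypothesis = "This example is about politics."

    inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits[0], dim=-1)
    # Map probabilities to the entailment/neutral/contradiction labels
    print({model.config.id2label[i]: round(p.item(), 3) for i, p in enumerate(probs)})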

Cloud GPUs

For faster inference with this large model, consider cloud GPU services such as AWS EC2 instances with NVIDIA GPUs or Google Cloud's Vertex AI (formerly AI Platform).
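
If a CUDA GPU is available, the pipeline can be pinned to it. A minimal sketch:

    import torch
    from transformers import pipeline

    # device=0 selects the first CUDA GPU; -1 falls back to CPU
    device = 0 if torch.cuda.is_available() else -1
    classifier = pipeline(
        "zero-shot-classification",
        model="MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli",
        device=device,
    )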

License

The model is available under the MIT License, allowing flexible usage and modification.
