DeBERTa-v3-base-mnli-fever-anli

MoritzLaurer

Introduction

DeBERTa-v3-base-mnli-fever-anli is a language model fine-tuned for natural language inference (NLI) tasks. It builds on Microsoft's DeBERTa-v3 architecture and has been fine-tuned on the MultiNLI, Fever-NLI, and Adversarial-NLI (ANLI) datasets. This model excels at zero-shot classification and outperforms many larger models on the ANLI benchmark.

Architecture

The model is based on Microsoft's DeBERTa-v3 architecture, which improves on earlier DeBERTa versions by replacing the masked-language-modeling pre-training objective with ELECTRA-style replaced token detection, combined with gradient-disentangled embedding sharing. This results in enhanced performance on various NLI tasks.
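The replaced-token-detection objective can be illustrated in miniature: a generator swaps out some tokens, and the discriminator is trained to label each position as original or replaced. The token lists below are hypothetical examples for illustration, not real tokenizer output:

```python
# Toy illustration of the replaced-token-detection (RTD) objective used in
# DeBERTa-v3 pre-training: the discriminator predicts, per position, whether
# the token is original (0) or was replaced by the generator (1).
original = ["the", "cat", "sat", "on", "the", "mat"]
corrupted = ["the", "dog", "sat", "on", "the", "rug"]  # generator replaced two tokens

# Gold labels the discriminator is trained to predict.
rtd_labels = [int(o != c) for o, c in zip(original, corrupted)]
print(rtd_labels)  # [0, 1, 0, 0, 0, 1]
```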

Training

The model was trained using the Hugging Face Trainer with specific hyperparameters such as a learning rate of 2e-05, a batch size of 32, and a warmup ratio of 0.1. It was trained over three epochs on a dataset consisting of 763,913 NLI hypothesis-premise pairs from MultiNLI, Fever-NLI, and ANLI datasets.
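The hyperparameters above imply roughly the following step counts. This is a back-of-the-envelope sketch; the exact numbers depend on Trainer internals (e.g. whether the last partial batch is dropped, gradient accumulation), which are assumptions here:

```python
import math

# Hyperparameters stated in the Training section.
num_examples = 763_913
batch_size = 32
num_epochs = 3
warmup_ratio = 0.1

# Approximate optimizer steps, assuming no gradient accumulation and that
# the final partial batch is kept.
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = int(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)  # 23873 71619 7161
```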

Guide: Running Locally

  1. Install Dependencies:
    Use the command pip install transformers[sentencepiece] to install the necessary packages.

  2. Load Model and Tokenizer:

    from transformers import pipeline
    classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
    
  3. Perform Inference:

    sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
    candidate_labels = ["politics", "economy", "entertainment", "environment"]
    output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
    print(output)
    
  4. Cloud GPUs Recommended:
    For faster processing, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
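Under the hood, the zero-shot pipeline turns each candidate label into an NLI hypothesis (by default something like "This example is {label}.") and scores the premise against it; with multi_label=False the per-label entailment scores are normalized with a softmax. The sketch below shows that final scoring step using made-up entailment logits rather than real model output:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

premise = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]

# Hypothetical entailment logits, one per "This example is {label}." hypothesis.
# Real values would come from the NLI model's entailment head.
entailment_logits = [4.2, 0.3, -1.5, -0.8]

scores = softmax(entailment_logits)
ranked = sorted(zip(candidate_labels, scores), key=lambda p: p[1], reverse=True)
print(ranked[0][0])  # politics
```

With multi_label=True the pipeline instead scores each label independently (entailment vs. contradiction), so scores need not sum to one.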

License

The model is released under the MIT license, allowing for wide usage and modification in both commercial and non-commercial applications.
