DeBERTa-v3-base-mnli-fever-anli
Introduction
DeBERTa-v3-base-mnli-fever-anli is a pre-trained language model designed for natural language inference (NLI) tasks. It builds on Microsoft's DeBERTa-v3 architecture and has been fine-tuned on datasets like MultiNLI, Fever-NLI, and Adversarial-NLI (ANLI). This model excels at zero-shot classification and outperforms many larger models on the ANLI benchmark.
Architecture
The model is based on Microsoft's DeBERTa-v3 architecture, which improves over previous DeBERTa versions by replacing masked language modeling with an ELECTRA-style replaced-token-detection pre-training objective. This results in enhanced performance on various NLI tasks.
Training
The model was trained using the Hugging Face Trainer with a learning rate of 2e-5, a batch size of 32, and a warmup ratio of 0.1, for three epochs over 763,913 hypothesis-premise pairs drawn from the MultiNLI, Fever-NLI, and ANLI datasets.
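From the reported hyperparameters and dataset size, the implied training schedule can be sketched in plain Python; the step arithmetic below is illustrative (single device, no gradient accumulation assumed), not taken from the model card:

```python
# Hyperparameters reported in the model card
training_config = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
    "warmup_ratio": 0.1,
    "num_train_epochs": 3,
}

num_pairs = 763_913  # MultiNLI + Fever-NLI + ANLI hypothesis-premise pairs

# Ceiling division: the last, partially filled batch still counts as a step
steps_per_epoch = -(-num_pairs // training_config["per_device_train_batch_size"])
total_steps = steps_per_epoch * training_config["num_train_epochs"]
warmup_steps = int(total_steps * training_config["warmup_ratio"])

print(steps_per_epoch, total_steps, warmup_steps)  # → 23873 71619 7161
```

With a warmup ratio of 0.1, roughly the first 7,000 optimizer steps linearly ramp the learning rate up to 2e-5 before the scheduler begins decaying it.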
Guide: Running Locally
1. Install dependencies:

```bash
pip install transformers[sentencepiece]
```

2. Load the model and tokenizer:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli")
```

3. Perform inference:

```python
sequence_to_classify = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
```

4. Cloud GPUs recommended: for faster processing, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
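Beyond the zero-shot pipeline, the model can also be queried directly as an NLI classifier on a premise-hypothesis pair. A minimal sketch; the hypothesis text here is an assumption for illustration, and the label names are read from the model's own config rather than hard-coded:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "Angela Merkel is a politician in Germany and leader of the CDU"
hypothesis = "Angela Merkel is active in politics."  # hypothetical example pair

# Encode the pair as premise/hypothesis and score it
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]

# Map scores to labels via the model config instead of assuming a label order
for idx, p in enumerate(probs):
    print(model.config.id2label[idx], round(p.item(), 3))
```

This is the underlying mechanism the zero-shot pipeline uses: each candidate label is turned into a hypothesis and the entailment probability serves as the label score.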
License
The model is released under the MIT license, allowing for wide usage and modification in both commercial and non-commercial applications.