nli-deberta-v3-base (cross-encoder)

Introduction
The nli-deberta-v3-base cross-encoder is a model for Natural Language Inference (NLI) built on the microsoft/deberta-v3-base architecture. Given a sentence pair, it predicts whether the relationship between the two sentences is contradiction, entailment, or neutral, and it can also be used for zero-shot classification.
Architecture
The model is built with the SentenceTransformers CrossEncoder class on top of microsoft/deberta-v3-base. As a cross-encoder it scores both sentences of a pair jointly, which makes it suited to sentence-pair classification and, by extension, zero-shot classification.
Training
The model was trained on the SNLI and MultiNLI datasets. For each sentence pair it outputs three scores corresponding to the labels contradiction, entailment, and neutral. Performance metrics include:
- SNLI-test dataset accuracy: 92.38%
- MNLI mismatched set accuracy: 90.04%
Guide: Running Locally
Basic Steps
- Using SentenceTransformers:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('cross-encoder/nli-deberta-v3-base')
scores = model.predict([('A man is eating pizza', 'A man eats something')])
```
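predict returns one row of three scores per pair, in the label order given above (contradiction, entailment, neutral). A minimal sketch of turning those rows into label names, assuming NumPy is available:

```python
import numpy as np

# Label order follows the model's output convention described above.
label_mapping = ['contradiction', 'entailment', 'neutral']

# scores has shape (num_pairs, 3); take the highest-scoring label per pair.
labels = [label_mapping[idx] for idx in np.asarray(scores).argmax(axis=1)]
print(labels)  # e.g. ['entailment'] for the example pair above
```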
- Using Transformers AutoModel:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/nli-deberta-v3-base')
tokenizer = AutoTokenizer.from_pretrained('cross-encoder/nli-deberta-v3-base')

features = tokenizer(['A man is eating pizza'], ['A man eats something'],
                     padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    scores = model(**features).logits
```
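Here scores holds raw logits, again in the contradiction/entailment/neutral order. A short sketch, under that assumption, of converting them into probabilities and label names:

```python
# Softmax over the three logits to get class probabilities per pair.
probs = torch.softmax(scores, dim=1)

label_mapping = ['contradiction', 'entailment', 'neutral']
predicted = [label_mapping[idx] for idx in probs.argmax(dim=1).tolist()]
print(predicted)
```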
- Zero-Shot Classification:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='cross-encoder/nli-deberta-v3-base')

sent = "Apple just announced the newest iPhone X"
candidate_labels = ["technology", "sports", "politics"]
res = classifier(sent, candidate_labels)
```
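The zero-shot pipeline returns a dictionary containing the input sequence plus the candidate labels and their scores, ranked from most to least likely. A small usage sketch:

```python
# Top-ranked label and its score.
print(res["labels"][0], res["scores"][0])

# All candidate labels with their scores.
for label, score in zip(res["labels"], res["scores"]):
    print(f"{label}: {score:.3f}")
```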
Cloud GPUs
For efficient model training and inference, consider using cloud-based GPU services such as AWS EC2 with GPU instances, Google Cloud Platform (GCP) with TPU support, or NVIDIA's GPU Cloud.
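If a GPU is available, whether locally or on one of these services, inference can be moved onto it. A minimal sketch, assuming a CUDA device and the same model ID as above:

```python
import torch
from transformers import pipeline

# Use the first CUDA device if present, otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1

classifier = pipeline("zero-shot-classification",
                      model='cross-encoder/nli-deberta-v3-base',
                      device=device)

res = classifier("Apple just announced the newest iPhone X",
                 ["technology", "sports", "politics"])
```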
License
The model is released under the Apache-2.0 license, allowing for both personal and commercial use.