DistilBERT Base Uncased MNLI
Introduction
The DistilBERT Base Uncased MNLI model is a zero-shot classification model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset. Developed by the Typeform team, it processes English text and is built upon the DistilBERT base model. The license for this model is currently unknown.
Architecture
This model uses the DistilBERT architecture, a lighter and faster distillation of BERT that trades a small amount of accuracy for substantially lower compute cost. Because it is fine-tuned on NLI data, it supports zero-shot classification: each candidate label is recast as a hypothesis (e.g., "This text is about sports.") and scored by how strongly the model predicts entailment, so it can classify text into categories it never saw during training.
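Because classification is framed as entailment, the Transformers zero-shot-classification pipeline can drive the model directly. A minimal sketch; the sample text and candidate labels are illustrative:

from transformers import pipeline

# The pipeline builds a hypothesis from each candidate label and ranks
# labels by the model's entailment probability.
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

result = classifier(
    "The new graphics card renders 4K games at 120 frames per second.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label first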
Training
The model was fine-tuned on the MNLI dataset, which comprises about 433k sentence pairs across various genres. Fine-tuning was conducted on an AWS EC2 p3.2xlarge instance with the following hyperparameters:
- Max sequence length: 128
- Batch size: 16
- Learning rate: 2e-5
- Number of epochs: 5
Evaluation results indicate an accuracy of approximately 82.0% on the MNLI validation data.
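For reference, here is a minimal sketch of a comparable fine-tuning run with the reported hyperparameters, using the Hugging Face Trainer API and the multi_nli dataset from the datasets library. This is an illustrative reconstruction, not Typeform's original training script:

from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=3  # entailment / neutral / contradiction
)

mnli = load_dataset("multi_nli")

def tokenize(batch):
    # Premise/hypothesis pairs, truncated to the reported max length of 128
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

mnli = mnli.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="distilbert-mnli",
    per_device_train_batch_size=16,  # batch size: 16
    learning_rate=2e-5,              # learning rate: 2e-5
    num_train_epochs=5,              # epochs: 5
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=mnli["train"],
    eval_dataset=mnli["validation_matched"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()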
Guide: Running Locally
To use the model locally, follow these steps:
- Install the Transformers library:
pip install transformers
- Load the model and tokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")
- Run inference: feed premise/hypothesis pairs through the tokenizer and model to obtain entailment, neutral, and contradiction scores, as shown in the sketch below.
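A minimal inference sketch; the premise and hypothesis strings are illustrative, and label names are read from the model's config rather than hardcoded:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

inputs = tokenizer(premise, hypothesis, return_tensors="pt",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the three NLI classes, labeled via the model config
probs = logits.softmax(dim=-1).squeeze()
for i, p in enumerate(probs.tolist()):
    print(model.config.id2label[i], round(p, 3))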
Suggested Cloud GPUs
For optimal performance, consider using cloud GPUs such as NVIDIA Tesla V100 on platforms like AWS, Google Cloud, or Azure.
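With the zero-shot pipeline, placing the model on a GPU is a one-argument change; a minimal sketch, assuming a CUDA device is available:

from transformers import pipeline

# device=0 selects the first CUDA GPU; device=-1 falls back to CPU
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
    device=0,
)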
License
The licensing details for the DistilBERT Base Uncased MNLI model are not specified. Users should verify licensing terms before deployment in production environments.