DistilBERT Base Uncased MNLI
Introduction
The DistilBERT Base Uncased MNLI model is a zero-shot classification model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset. Developed by the Typeform team, it processes English text and is built upon the DistilBERT base model. The license for this model is currently unknown.
Architecture
This model uses the DistilBERT architecture, a lighter and faster distillation of BERT that trades a small amount of accuracy for substantially lower compute cost. Because it is fine-tuned on NLI data, it supports zero-shot classification: each candidate label is recast as a hypothesis (e.g., "This text is about sports.") and scored by how strongly the model predicts entailment, so it can classify text into categories it never saw during training.
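Because classification is framed as entailment, the Transformers zero-shot-classification pipeline can drive the model directly. A minimal sketch; the sample text and candidate labels are illustrative:

from transformers import pipeline

# The pipeline builds a hypothesis from each candidate label and ranks
# labels by the model's entailment probability.
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

result = classifier(
    "The new graphics card renders 4K games at 120 frames per second.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label first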
Training
The model was fine-tuned on the MNLI dataset, which comprises about 433k sentence pairs across various genres. Fine-tuning was conducted on an AWS EC2 p3.2xlarge instance with the following hyperparameters:
- Max sequence length: 128
- Batch size: 16
- Learning rate: 2e-5
- Number of epochs: 5
Evaluation results indicate an accuracy of approximately 82.0% on the MNLI validation data.
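For reference, here is a minimal sketch of a comparable fine-tuning run with the reported hyperparameters, using the Hugging Face Trainer API and the multi_nli dataset from the datasets library. This is an illustrative reconstruction, not Typeform's original training script:

from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=3  # entailment / neutral / contradiction
)

mnli = load_dataset("multi_nli")

def tokenize(batch):
    # Premise/hypothesis pairs, truncated to the reported max length of 128
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

mnli = mnli.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="distilbert-mnli",
    per_device_train_batch_size=16,  # batch size: 16
    learning_rate=2e-5,              # learning rate: 2e-5
    num_train_epochs=5,              # epochs: 5
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=mnli["train"],
    eval_dataset=mnli["validation_matched"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()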
Guide: Running Locally
To use the model locally, follow these steps:
- Install the Transformers library:
pip install transformers
- Load the model and tokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")
- Run inference: feed premise/hypothesis pairs through the tokenizer and model to obtain entailment, neutral, and contradiction scores, as shown in the sketch below.
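A minimal inference sketch; the premise and hypothesis strings are illustrative, and label names are read from the model's config rather than hardcoded:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

inputs = tokenizer(premise, hypothesis, return_tensors="pt",
                   truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the three NLI classes, labeled via the model config
probs = logits.softmax(dim=-1).squeeze()
for i, p in enumerate(probs.tolist()):
    print(model.config.id2label[i], round(p, 3))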
Suggested Cloud GPUs
For optimal performance, consider using cloud GPUs such as NVIDIA Tesla V100 on platforms like AWS, Google Cloud, or Azure.
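With the zero-shot pipeline, placing the model on a GPU is a one-argument change; a minimal sketch, assuming a CUDA device is available:

from transformers import pipeline

# device=0 selects the first CUDA GPU; device=-1 falls back to CPU
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
    device=0,
)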
License
The licensing details for the DistilBERT Base Uncased MNLI model are not specified. Users should verify licensing terms before deployment in production environments.