distilbert-base-uncased-finetuned-sst-2-english

Introduction

DistilBERT is a smaller, faster, and cheaper version of BERT developed by Hugging Face. This checkpoint, distilbert-base-uncased-finetuned-sst-2-english, fine-tunes DistilBERT on the Stanford Sentiment Treebank (SST-2) dataset for text classification. It targets English-language text and performs binary sentiment analysis (positive/negative).
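
For a quick check, the checkpoint can also be used through the high-level pipeline API; the snippet below is a minimal sketch using the public model id.

    from transformers import pipeline

    # Load the fine-tuned checkpoint behind the sentiment-analysis pipeline
    classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

    # Returns a list of {"label": "POSITIVE"/"NEGATIVE", "score": float} dicts
    print(classifier("I really enjoyed this movie!"))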

Architecture

The model is based on DistilBERT, a distilled version of BERT that retains about 97% of its language understanding while being 60% faster and 40% smaller. With a sequence-classification head on top, it is well suited to efficient text classification and is usable from multiple machine learning frameworks, including PyTorch and TensorFlow.
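
As a sketch of the TensorFlow side of that framework support, the same checkpoint can be loaded with the TF* model classes; this assumes TensorFlow weights are published on the Hub for the checkpoint (otherwise from_pt=True converts from the PyTorch weights).

    import tensorflow as tf
    from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

    # Same checkpoint, loaded with the TensorFlow model class
    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
    model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
    logits = model(**inputs).logits
    predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
    print(model.config.id2label[predicted_class_id])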

Training

The training data used for this model is the Stanford Sentiment Treebank (SST-2). Key hyperparameters during fine-tuning include:

  • Learning rate: 1e-5
  • Batch size: 32
  • Warmup steps: 600
  • Maximum sequence length: 128
  • Number of training epochs: 3

According to the Hugging Face model card, the fine-tuned model reaches roughly 91.3% accuracy on the SST-2 dev set, making it well suited for binary sentiment analysis tasks.
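
The original fine-tuning script is not reproduced here, but a minimal sketch of how the hyperparameters above could be plugged into the datasets library and the Trainer API might look like this (column names follow the GLUE sst2 configuration):

    from datasets import load_dataset
    from transformers import (
        DistilBertTokenizer,
        DistilBertForSequenceClassification,
        TrainingArguments,
        Trainer,
    )

    # Load SST-2 from the GLUE benchmark
    dataset = load_dataset("glue", "sst2")
    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

    def tokenize(batch):
        # Truncate/pad to the maximum sequence length listed above
        return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

    dataset = dataset.map(tokenize, batched=True)

    # Start from the pretrained DistilBERT base model with a 2-class head
    model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

    # Hyperparameters mirror the values listed above
    args = TrainingArguments(
        output_dir="sst2-distilbert",
        learning_rate=1e-5,
        per_device_train_batch_size=32,
        warmup_steps=600,
        num_train_epochs=3,
    )

    trainer = Trainer(model=model, args=args,
                      train_dataset=dataset["train"],
                      eval_dataset=dataset["validation"])
    trainer.train()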

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the Transformers library and PyTorch using pip:

    pip install transformers torch
    
  2. Load the model and tokenizer in your Python script:

    import torch
    from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

    # Download the tokenizer and fine-tuned classification weights from the Hugging Face Hub
    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
    model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
    
  3. Prepare your input text and perform inference:

    # Tokenize the input and run a forward pass without tracking gradients
    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Map the highest-scoring logit to its human-readable label (NEGATIVE/POSITIVE)
    predicted_class_id = logits.argmax().item()
    print(model.config.id2label[predicted_class_id])
    

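To turn raw logits into per-label probabilities and score several sentences at once, a softmax can be applied over the class dimension; this sketch reuses the tokenizer and model loaded in step 2.

    import torch.nn.functional as F

    # Batch of sentences; padding aligns sequence lengths within the batch
    texts = ["I really enjoyed this movie!", "The plot was a complete mess."]
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Softmax over the two classes gives per-label probabilities
    probs = F.softmax(logits, dim=-1)
    for text, p in zip(texts, probs):
        label_id = int(p.argmax())
        print(f"{text!r} -> {model.config.id2label[label_id]} ({p[label_id].item():.3f})")
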
For better throughput, especially on larger workloads, consider running inference on a GPU, either locally or via cloud providers such as AWS, Google Cloud, or Azure.
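
If a GPU is available, whether local or on one of those services, moving the model and inputs onto it is usually enough to speed up inference; a minimal sketch:

    # Use a GPU if one is available, otherwise fall back to CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)

    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    print(model.config.id2label[logits.argmax(dim=-1).item()])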

License

The model is licensed under the Apache-2.0 License, which allows for both commercial and non-commercial use, modification, and distribution.
