distilbert-base-uncased-finetuned-sst-2-english
Introduction
DistilBERT is a smaller, faster, and cheaper version of BERT developed by Hugging Face. The model distilbert-base-uncased-finetuned-sst-2-english is a DistilBERT checkpoint fine-tuned on the Stanford Sentiment Treebank (SST-2) dataset. It targets English text classification, specifically sentiment analysis: it labels a piece of text as positive or negative.
Architecture
The model is based on DistilBERT, a distilled version of BERT that retains 97% of its language understanding while being 60% faster and 40% smaller. This checkpoint adds a sequence classification head on top of DistilBERT for efficient text classification, and it can be used from several machine learning frameworks, including PyTorch and TensorFlow.
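To make the "40% smaller" figure concrete, here is a rough sketch of the arithmetic. The parameter counts (about 110 million for BERT-base and about 66 million for DistilBERT) are the approximate figures reported in the DistilBERT paper, not values stated in this summary, so treat them as assumptions.

```python
# Approximate parameter counts as reported in the DistilBERT paper
# (assumed here; they are not stated in this summary).
BERT_BASE_PARAMS = 110_000_000
DISTILBERT_PARAMS = 66_000_000


def size_reduction(original: int, distilled: int) -> float:
    """Return the fractional reduction in parameter count."""
    return 1 - distilled / original


reduction = size_reduction(BERT_BASE_PARAMS, DISTILBERT_PARAMS)
print(f"DistilBERT is about {reduction:.0%} smaller than BERT-base")
```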
Training
The training data used for this model is the Stanford Sentiment Treebank (SST-2). Key hyperparameters during fine-tuning include:
- Learning rate: 1e-5
- Batch size: 32
- Warmup steps: 600
- Maximum sequence length: 128
- Number of training epochs: 3
The model card reports an accuracy of about 91% on the SST-2 development set, which makes the model suitable for sentiment analysis tasks.
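To show how the listed hyperparameters interact, the sketch below estimates the number of optimizer steps and the learning rate at a given step. Two things here are assumptions rather than facts from this summary: the size of the SST-2 training split (67,349 examples, the GLUE figure) and the shape of the schedule after warmup (linear decay to zero, a common default). The names `lr_at` and `TRAIN_EXAMPLES` are illustrative only.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 1e-5
BATCH_SIZE = 32
WARMUP_STEPS = 600
EPOCHS = 3

# Assumption: size of the GLUE SST-2 training split (not stated above).
TRAIN_EXAMPLES = 67_349

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step.

    Linear warmup over WARMUP_STEPS, then linear decay to zero.
    The decay shape is an assumption; only the warmup length is given.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - WARMUP_STEPS)


print("total optimizer steps:", total_steps)
for step in (0, 300, 600, total_steps):
    print(step, lr_at(step))
```

Under these assumptions, the 600 warmup steps cover a little under 10% of training, so the learning rate ramps up over roughly the first third of the first epoch.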
Guide: Running Locally
To run the model locally, follow these steps:
- Install the Transformers library from Hugging Face using pip:
    pip install transformers torch
- Load the model and tokenizer in your Python script:
    import torch
    from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

    # Download (or load from the local cache) the tokenizer and fine-tuned weights
    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
    model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
- Prepare your input text and perform inference:
    # Tokenize a single sentence into PyTorch tensors
    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

    # Run the forward pass without tracking gradients
    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the highest-scoring class and map it to its label
    predicted_class_id = logits.argmax().item()
    print(model.config.id2label[predicted_class_id])
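The inference step picks the class with the largest logit, but it does not say how confident the model is. A minimal sketch of turning logits into probabilities with softmax follows, written in plain Python so it runs without downloading the model. The example logits are hypothetical, and the `ID2LABEL` mapping is assumed to mirror `model.config.id2label` for this checkpoint; check it against the loaded model before relying on it.

```python
import math

# Assumed to mirror model.config.id2label for this checkpoint;
# verify against the loaded model before relying on it.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for one sentence, ordered as [NEGATIVE, POSITIVE].
example_logits = [-4.2, 4.5]
label, prob = classify(example_logits)
print(label, round(prob, 4))
```

With the tensors produced by the model, `torch.softmax(logits, dim=-1)` performs the same conversion directly on the logits.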
For faster inference, especially on larger datasets, consider running the model on a GPU, for example through cloud services such as AWS, Google Cloud, or Azure.
License
The model is licensed under the Apache-2.0 License, which allows for both commercial and non-commercial use, modification, and distribution.