roberta base finetuned yelp polarity
VictorSanh
Introduction
This document provides details about the RoBERTa-base model fine-tuned for binary sentiment classification on Yelp polarity data. The model achieves an accuracy of 98.08% on the test set.
Architecture
The model is based on the RoBERTa-base architecture, a transformer encoder pretrained with a robustly optimized variant of the BERT procedure. It was fine-tuned for binary sentiment classification on the Yelp polarity dataset.
Training
The model was trained using the following hyper-parameters:
- Number of training epochs: 2.0
- Learning rate: 1e-05
- Weight decay: 0.0
- Adam epsilon: 1e-08
- Maximum gradient norm: 1.0
- Per device training batch size: 32
- Gradient accumulation steps: 1
- Warmup steps: 3500
- Seed: 42
Training was conducted on a single GPU to optimize performance and resource usage.
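To make the relationship between these settings concrete, here is a minimal sketch that derives the effective batch size and the total number of optimizer steps, and evaluates a learning-rate schedule at a given step. Two things are assumptions that the list above does not state: the training split size (560,000 examples, the commonly cited size of the Yelp polarity training set) and the schedule shape (linear warmup followed by linear decay to zero). The helper names are hypothetical.

```python
import math

# Values taken from the hyper-parameter list above.
LEARNING_RATE = 1e-05
NUM_EPOCHS = 2
PER_DEVICE_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 1
WARMUP_STEPS = 3500
NUM_DEVICES = 1  # training ran on a single GPU

# Assumption: size of the Yelp polarity training split (not stated above).
TRAIN_EXAMPLES = 560_000


def total_optimizer_steps(n_examples=TRAIN_EXAMPLES):
    """Number of optimizer updates over the whole run."""
    effective_batch = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    return steps_per_epoch * NUM_EPOCHS


def lr_at(step, total_steps):
    """Assumed schedule: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - WARMUP_STEPS)


total = total_optimizer_steps()
print("total steps:", total)
print("warmup fraction:", WARMUP_STEPS / total)
print("lr halfway through warmup:", lr_at(WARMUP_STEPS // 2, total))
```

Under the assumed dataset size, the run has 35,000 optimizer steps, so the 3,500 warmup steps would cover the first 10% of training.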
Guide: Running Locally
To run the model locally, follow these steps:
- Install the Transformers Library
  Ensure you have the transformers library installed. You can install it via pip:

      pip install transformers

- Load the Model
  Use the following Python code to load the model:

      from transformers import RobertaTokenizer, RobertaForSequenceClassification

      model_name = "VictorSanh/roberta-base-finetuned-yelp-polarity"
      tokenizer = RobertaTokenizer.from_pretrained(model_name)
      model = RobertaForSequenceClassification.from_pretrained(model_name)

- Inference
  You can use the model for inference by tokenizing input text and running it through the model.

- Hardware Suggestions
  For optimal performance, especially with larger datasets or batch sizes, consider using a cloud GPU service such as AWS EC2, Google Cloud's Compute Engine, or Azure's GPU offerings.
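The inference step described above can be sketched as follows. This is an illustration, not the author's script: it assumes PyTorch is installed alongside transformers, and it assumes that output index 0 means negative and index 1 means positive, which the model card does not confirm (check the model's config before relying on it). The helper names `softmax`, `label_from_logits`, and `classify` are hypothetical.

```python
import math

# Assumption: index 0 = negative, index 1 = positive.
ASSUMED_LABELS = ["negative", "positive"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ASSUMED_LABELS[best], probs[best]


def classify(text, model_name="VictorSanh/roberta-base-finetuned-yelp-polarity"):
    """Tokenize one review, run the model, and map logits to a label."""
    # Heavy dependencies are imported here so the helpers above
    # remain usable without them.
    import torch
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    tokenizer = RobertaTokenizer.from_pretrained(model_name)
    model = RobertaForSequenceClassification.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():  # no gradients needed at inference time
        logits = model(**inputs).logits
    return label_from_logits(logits[0].tolist())


if __name__ == "__main__":
    # Downloads the model weights on first use.
    print(classify("The food was great and the staff were friendly."))
```

For repeated calls, loading the tokenizer and model once and reusing them avoids reloading the weights on every review.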
License
The model and code are available under the Apache License 2.0, allowing for both personal and commercial use with attribution.