roberta-base-finetuned-yelp-polarity

VictorSanh

Introduction

This document provides details about a RoBERTa-base model fine-tuned for binary sentiment classification on the Yelp Polarity dataset. The model achieves 98.08% accuracy on the test set.
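
The original evaluation script is not published; the following is a minimal sketch of how that figure could be checked against the yelp_polarity test split using the datasets library. The single-example batching, 512-token truncation, and label ordering here are assumptions, not the author's setup:

    import torch
    from datasets import load_dataset
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    model_name = "VictorSanh/roberta-base-finetuned-yelp-polarity"
    tokenizer = RobertaTokenizer.from_pretrained(model_name)
    model = RobertaForSequenceClassification.from_pretrained(model_name).eval()

    # yelp_polarity: "text" holds the review, "label" is 0 = negative,
    # 1 = positive (assumed to match the model's output ordering)
    test_set = load_dataset("yelp_polarity", split="test")

    correct = 0
    for example in test_set:
        inputs = tokenizer(example["text"], truncation=True, max_length=512,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        correct += int(logits.argmax(dim=-1).item() == example["label"])

    print(f"accuracy: {correct / len(test_set):.4f}")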

Architecture

The model is based on the RoBERTa-base architecture, a transformer encoder pretrained with the RoBERTa approach (a robustly optimized variant of BERT pretraining). For this checkpoint, the encoder and a sequence-classification head were fine-tuned for sentiment classification on the Yelp Polarity dataset.

Training

The model was trained using the following hyper-parameters:

  • Number of training epochs: 2.0
  • Learning rate: 1e-05
  • Weight decay: 0.0
  • Adam epsilon: 1e-08
  • Maximum gradient norm: 1.0
  • Per device training batch size: 32
  • Gradient accumulation steps: 1
  • Warmup steps: 3500
  • Seed: 42

Training was conducted on a single GPU.
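
These hyper-parameters map directly onto Hugging Face's TrainingArguments. The sketch below illustrates that mapping; it is not the author's published training script, and the output directory is a placeholder:

    from transformers import TrainingArguments

    # Hypothetical reconstruction of the reported hyper-parameters;
    # the original fine-tuning script is not published.
    training_args = TrainingArguments(
        output_dir="./roberta-yelp-polarity",  # placeholder path
        num_train_epochs=2.0,
        learning_rate=1e-5,
        weight_decay=0.0,
        adam_epsilon=1e-8,
        max_grad_norm=1.0,
        per_device_train_batch_size=32,
        gradient_accumulation_steps=1,
        warmup_steps=3500,
        seed=42,
    )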

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the Transformers Library
    Ensure you have the transformers library installed. You can install it via pip:

    pip install transformers
    
  2. Load the Model
    Use the following Python code to load the model:

    # Load the tokenizer and the fine-tuned classification model from the Hub
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    model_name = "VictorSanh/roberta-base-finetuned-yelp-polarity"
    tokenizer = RobertaTokenizer.from_pretrained(model_name)
    model = RobertaForSequenceClassification.from_pretrained(model_name)
    
  3. Inference
    You can use the model for inference by tokenizing input text and running it through the model.
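
    As a minimal sketch (assuming the tokenizer and model from step 2, and assuming label 0 = negative and label 1 = positive, as in the Yelp Polarity dataset):

    import torch

    # Tokenize a sample review and run it through the model
    inputs = tokenizer("The food was excellent and the staff were friendly.",
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Convert logits to probabilities; the label ordering (0 = negative,
    # 1 = positive) is assumed to match the Yelp Polarity dataset
    probs = torch.softmax(logits, dim=-1)
    print("positive" if probs.argmax(dim=-1).item() == 1 else "negative")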

  4. Hardware Suggestions
    For optimal performance, especially with larger datasets or batch sizes, consider using a cloud GPU service such as AWS EC2, Google Cloud's Compute Engine, or Azure's GPU offerings.

License

The model and code are available under the Apache License 2.0, allowing for both personal and commercial use with attribution.
