rotten tomatoes model
Introduction
The rotten-tomatoes-model is a text classification model based on the bert-base-cased architecture, fine-tuned on the Rotten Tomatoes dataset. It predicts the sentiment of movie reviews, labeling them as either negative (LABEL_0) or positive (LABEL_1).
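For a quick check of these label conventions, the model can be queried through the Transformers pipeline API. The snippet below is a minimal sketch, assuming the hosted checkpoint ships TensorFlow weights (hence framework="tf"); the example review text is purely illustrative.

from transformers import pipeline

# Hypothetical quick check; loads the TensorFlow weights of the hosted checkpoint.
classifier = pipeline("text-classification", model="klin1/rotten-tomatoes-model", framework="tf")
print(classifier("A sharp, genuinely moving piece of filmmaking."))
# Expected output shape: [{'label': 'LABEL_1', 'score': ...}] for a positive review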
Architecture
The model utilizes the bert-base-cased architecture, which is part of the BERT (Bidirectional Encoder Representations from Transformers) family. It is implemented using the Transformers library, leveraging TensorFlow for model training and deployment.
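On top of the bert-base-cased encoder sits a two-label sequence classification head. As a minimal sketch (assuming the hosted config keeps the default label names noted above), the head configuration can be inspected without downloading the full weights:

from transformers import BertConfig

# Load only the configuration of the hosted checkpoint and inspect the label setup.
config = BertConfig.from_pretrained("klin1/rotten-tomatoes-model")
print(config.num_labels)  # expected: 2
print(config.id2label)    # expected: {0: 'LABEL_0', 1: 'LABEL_1'}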
Training
The model was fine-tuned using the Rotten Tomatoes dataset, consisting of 5,331 positive and 5,331 negative movie reviews. The training process achieved the following results over three epochs (a sketch of a comparable fine-tuning setup follows the results below):
- Epoch 0: Train Loss: 0.4028, Train Accuracy: 0.8213, Validation Loss: 0.4626, Validation Accuracy: 0.8433
- Epoch 1: Train Loss: 0.1628, Train Accuracy: 0.9390, Validation Loss: 0.3498, Validation Accuracy: 0.8696
- Epoch 2: Train Loss: 0.0386, Train Accuracy: 0.9878, Validation Loss: 0.4790, Validation Accuracy: 0.8621
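The original training script is not included with the card, but a comparable Keras fine-tuning run can be sketched as follows. Batch size, learning rate, and maximum sequence length here are assumptions, not values taken from the original run.

import tensorflow as tf
from datasets import load_dataset
from transformers import BertTokenizer, TFBertForSequenceClassification

# Rotten Tomatoes dataset: 5,331 positive and 5,331 negative reviews in total.
dataset = load_dataset("rotten_tomatoes")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

def encode(split):
    # Tokenize a split to fixed-length TensorFlow tensors (max_length is an assumption).
    enc = tokenizer(dataset[split]["text"], padding="max_length", truncation=True,
                    max_length=128, return_tensors="tf")
    return dict(enc), tf.constant(dataset[split]["label"])

train_x, train_y = encode("train")
val_x, val_y = encode("validation")

# Fine-tune bert-base-cased with a plain Adam setup (learning rate is an assumption).
model = TFBertForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(train_x, train_y, validation_data=(val_x, val_y), epochs=3, batch_size=16)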
Guide: Running Locally
- Set up the environment: Install the necessary libraries.
pip install transformers==4.18.0 tensorflow==2.8.0 datasets==2.1.0 tokenizers==0.12.1
- Download the Model: Use the Hugging Face Transformers library to load the model.
from transformers import BertTokenizer, TFBertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained('klin1/rotten-tomatoes-model')
model = TFBertForSequenceClassification.from_pretrained('klin1/rotten-tomatoes-model')
- Inference: Tokenize and classify your text inputs.
import tensorflow as tf

inputs = tokenizer("Your movie review here", return_tensors="tf")
outputs = model(inputs)
predictions = tf.nn.softmax(outputs.logits, axis=-1)
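To map the resulting probabilities back to the card's label names, one option is the id2label mapping stored in the model configuration. This is a minimal sketch that continues the snippet above and assumes the default LABEL_0/LABEL_1 names described earlier:

# Pick the highest-probability class and look up its name in the model config.
predicted_class = int(tf.argmax(predictions, axis=-1)[0])
print(model.config.id2label[predicted_class])  # 'LABEL_0' (negative) or 'LABEL_1' (positive)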
Cloud GPUs: For efficient training and inference, consider cloud GPU instances such as AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure.
License
The rotten-tomatoes-model is licensed under the Apache 2.0 License, allowing free use and distribution under the terms of the license.