RoBERTa Toxicity Classifier (s-nlp/roberta_toxicity_classifier)
Introduction
The RoBERTa Toxicity Classifier is designed to classify text as either toxic or neutral. It is built upon the RoBERTa model architecture and fine-tuned using a dataset from Jigsaw, which includes data from competitions held in 2018, 2019, and 2020.
Architecture
The model is based on the RoBERTa architecture, specifically the FacebookAI/roberta-large variant. RoBERTa ("A Robustly Optimized BERT Pretraining Approach") is an enhancement of the BERT model that delivers improved performance on a range of natural language processing tasks.
Training
The model was fine-tuned on a dataset of about 2 million examples drawn from the Jigsaw toxic comment classification challenges. The dataset was split into two parts, and the fine-tuned model achieves an AUC-ROC of 0.98 and an F1-score of 0.76 on the test set of the first Jigsaw competition.
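As a hedged illustration of how these two metrics are typically computed (this is not code from the model authors, and the label and probability arrays below are placeholders rather than Jigsaw test data):

from sklearn.metrics import roc_auc_score, f1_score

y_true = [0, 1, 1, 0, 1]                   # placeholder gold labels: 0 = neutral, 1 = toxic
y_prob = [0.10, 0.85, 0.40, 0.20, 0.95]    # placeholder predicted probabilities of the toxic class
y_pred = [int(p >= 0.5) for p in y_prob]   # threshold at 0.5 to get hard labels for the F1-score

print("AUC-ROC:", roc_auc_score(y_true, y_prob))
print("F1-score:", f1_score(y_true, y_pred))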
Guide: Running Locally
To run the RoBERTa Toxicity Classifier locally, follow these steps:
- Install Dependencies: Ensure you have PyTorch and the Transformers library installed.

pip install torch transformers
- Load the Model and Tokenizer:

import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')
- Perform Inference:

batch = tokenizer.encode("You are amazing!", return_tensors="pt")
output = model(batch)  # idx 0 for neutral, idx 1 for toxic
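The model returns raw logits. As a minimal, self-contained sketch of how they can be turned into probabilities and a final label (the softmax post-processing below is an assumption, not shown on the model card itself):

import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')
model.eval()

batch = tokenizer.encode("You are amazing!", return_tensors="pt")
with torch.no_grad():
    logits = model(batch).logits          # shape (1, 2): [neutral, toxic]

probs = torch.softmax(logits, dim=-1)[0]  # convert logits to class probabilities
label = "toxic" if probs[1] > probs[0] else "neutral"
print(f"neutral={probs[0]:.3f} toxic={probs[1]:.3f} -> {label}")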
For faster inference, consider running the model on a GPU, for example cloud GPUs from providers such as AWS, Google Cloud, or Azure.
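As a minimal sketch of that setup (the device-selection code below is an assumption, not part of the model card), the model and inputs can be moved to a GPU like this:

import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Pick a CUDA GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier').to(device)
model.eval()

batch = tokenizer.encode("You are amazing!", return_tensors="pt").to(device)
with torch.no_grad():
    logits = model(batch).logits
print(logits)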
License
This model is distributed under the OpenRAIL++ License, which facilitates the development of technologies serving the public good in both industrial and academic contexts.