RoBERTa Toxicity Classifier

Introduction

The RoBERTa Toxicity Classifier is designed to classify text as either toxic or neutral. It is built upon the RoBERTa model architecture and fine-tuned using a dataset from Jigsaw, which includes data from competitions held in 2018, 2019, and 2020.
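As a quick illustration, the classifier can be called through the Transformers text-classification pipeline (a minimal sketch; the exact label strings come from the model's id2label mapping):

    from transformers import pipeline

    # Minimal sketch: high-level pipeline wrapper around the classifier.
    classifier = pipeline("text-classification", model="s-nlp/roberta_toxicity_classifier")
    print(classifier("You are amazing!"))
    # Prints a list with one dict, e.g. [{'label': 'neutral', 'score': ...}]
    # (label strings depend on the model's id2label mapping).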

Architecture

The model is based on the RoBERTa architecture, specifically the FacebookAI/roberta-large variant. RoBERTa ("A Robustly Optimized BERT Pretraining Approach") keeps the BERT architecture but retrains it with an improved pretraining recipe (longer training, larger batches, more data, and dynamic masking), yielding stronger performance on downstream natural language processing tasks.
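To see what this means in practice, the published configuration can be inspected; the values below are the standard roberta-large dimensions plus a two-label classification head:

    from transformers import AutoConfig

    # Inspect the configuration shipped with the checkpoint.
    config = AutoConfig.from_pretrained("s-nlp/roberta_toxicity_classifier")
    print(config.num_hidden_layers)  # 24 transformer layers (roberta-large)
    print(config.hidden_size)        # hidden size 1024 (roberta-large)
    print(config.num_labels)         # 2 output classes: neutral / toxic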

Training

The model was fine-tuned on roughly 2 million examples drawn from the Jigsaw toxic comment classification challenges. The combined dataset was split into training and test parts, and the fine-tuned model achieves an AUC-ROC of 0.98 and an F1-score of 0.76 on the test set of the first Jigsaw competition.
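These metrics can be reproduced in outline with scikit-learn. The sketch below uses a tiny toy split in place of the actual Jigsaw test data (which is not bundled with the model) and assumes a 0.5 decision threshold for the F1-score:

    import torch
    from sklearn.metrics import f1_score, roc_auc_score
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    tokenizer = RobertaTokenizer.from_pretrained("s-nlp/roberta_toxicity_classifier")
    model = RobertaForSequenceClassification.from_pretrained("s-nlp/roberta_toxicity_classifier")

    # Toy stand-in for a held-out test split; replace with the real Jigsaw test set.
    texts = ["Have a wonderful day!", "You are a complete idiot."]
    labels = [0, 1]  # 0 = neutral, 1 = toxic

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        toxic_prob = torch.softmax(model(**batch).logits, dim=-1)[:, 1]

    # AUC-ROC is computed from the toxic-class probabilities,
    # F1 from hard predictions at a 0.5 threshold.
    print(roc_auc_score(labels, toxic_prob.numpy()))
    print(f1_score(labels, (toxic_prob > 0.5).int().numpy()))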

Guide: Running Locally

To run the RoBERTa Toxicity Classifier locally, follow these steps:

  1. Install Dependencies: Ensure you have PyTorch and the Transformers library installed.

    pip install torch transformers
    
  2. Load the Model and Tokenizer:

    import torch
    from transformers import RobertaTokenizer, RobertaForSequenceClassification
    
    tokenizer = RobertaTokenizer.from_pretrained('s-nlp/roberta_toxicity_classifier')
    model = RobertaForSequenceClassification.from_pretrained('s-nlp/roberta_toxicity_classifier')
    
  3. Perform Inference:

    batch = tokenizer.encode("You are amazing!", return_tensors="pt")
    output = model(batch)
    # output.logits has shape (1, 2): index 0 = neutral, index 1 = toxic
    probabilities = torch.softmax(output.logits, dim=-1)
    print(probabilities)
    

For faster inference, especially on large batches of text, consider running the model on a GPU, for example a cloud GPU from AWS, Google Cloud, or Azure.
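As a minimal sketch (assuming a CUDA-capable GPU is available), the model and tokenized inputs can be moved onto the GPU before inference; the same code falls back to the CPU otherwise:

    import torch
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    # Pick the GPU if one is visible, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = RobertaTokenizer.from_pretrained("s-nlp/roberta_toxicity_classifier")
    model = RobertaForSequenceClassification.from_pretrained("s-nlp/roberta_toxicity_classifier").to(device)

    # Tokenize a small batch and move the tensors to the same device as the model.
    batch = tokenizer(["You are amazing!", "You are an idiot."], padding=True, return_tensors="pt").to(device)
    with torch.no_grad():
        predictions = model(**batch).logits.argmax(dim=-1)
    print(predictions.tolist())  # 0 = neutral, 1 = toxic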

License

This model is distributed under the OpenRAIL++ License, which facilitates the development of technologies serving the public good in both industrial and academic contexts.
