Text Moderation LLM Model

Introduction

The Text Moderation model by KoalaAI is a text classification model that predicts whether text contains offensive content. It is built on DeBERTa-v3 and scores each input against several labels, such as sexual, hate, violence, harassment, and self-harm. It was trained on English text and may perform poorly on non-English input.

Architecture

The model is built on the DeBERTa-v3 architecture and is used for text classification. It outputs a score for each predefined category, including sexual content, hate speech, violence, and harassment, so that potentially offensive content can be detected.

Training

The model was trained with AutoTrain on datasets such as mmathys/openai-moderation-api-evaluation. The problem type is multi-class classification, and the reported CO2 emissions are 0.0397 grams. Key validation metrics:

Loss: 0.848
Accuracy: 75%
Macro F1: 0.326
Weighted F1: 0.703
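
The gap between the macro F1 (0.326) and the weighted F1 (0.703) is the pattern one would expect when frequent labels are predicted well and rare labels poorly, although the source does not report per-class scores. The sketch below uses made-up toy labels, not the model's validation data, to show how the two averages can diverge. The helper names are hypothetical.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label that appears in y_true."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by how often each class occurs."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)

# Toy data (hypothetical): 8 common examples and 2 rare ones that are missed.
y_true = ["OK"] * 8 + ["H", "V"]
y_pred = ["OK"] * 10

print(f"macro={macro_f1(y_true, y_pred):.3f} weighted={weighted_f1(y_true, y_pred):.3f}")
# prints: macro=0.296 weighted=0.711
```

In this toy case the classifier never predicts the two rare classes, so their F1 is 0. That pulls the macro average down sharply while barely moving the weighted average.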

Guide: Running Locally

To use the model, either query the hosted Inference API with cURL or load the model locally with the Python transformers library.

Using cURL

$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "I love AutoTrain"}' https://api-inference.huggingface.co/models/KoalaAI/Text-Moderation
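
The same request can be built from Python. The following is a minimal sketch using only the standard library. It mirrors the cURL command above (same URL, headers, and JSON body), and build_request is a hypothetical helper name. The request is constructed but not sent, since sending requires network access and a valid API key.

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/KoalaAI/Text-Moderation"

def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that the cURL command issues."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_request("I love AutoTrain", "YOUR_API_KEY")

# To send it (requires network access and a real key in place of YOUR_API_KEY):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```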

Using Python API

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the classifier and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained("KoalaAI/Text-Moderation")
tokenizer = AutoTokenizer.from_pretrained("KoalaAI/Text-Moderation")

# Tokenize the input and run a forward pass without tracking gradients
inputs = tokenizer("I love AutoTrain", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert the logits to per-label probabilities (plain Python floats)
logits = outputs.logits
probabilities = logits.softmax(dim=-1).squeeze().tolist()

# Map each class index to its label name
id2label = model.config.id2label
labels = [id2label[idx] for idx in range(len(probabilities))]

# Sort the labels from most to least probable
label_prob_pairs = list(zip(labels, probabilities))
label_prob_pairs.sort(key=lambda item: item[1], reverse=True)

for label, probability in label_prob_pairs:
    print(f"Label: {label} - Probability: {probability:.4f}")
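
Downstream code usually needs a decision rather than a ranked list. The following is a minimal sketch that assumes the model's label for acceptable text is named "OK" (check model.config.id2label to confirm). The 0.5 threshold, the helper name flagged_labels, and the short label codes in the example are illustrative assumptions, not values taken from the model card.

```python
def flagged_labels(label_prob_pairs, threshold=0.5, safe_label="OK"):
    """Return the non-safe (label, probability) pairs that meet the threshold."""
    return [
        (label, prob)
        for label, prob in label_prob_pairs
        if label != safe_label and float(prob) >= threshold
    ]

# Hypothetical scores for illustration, not real model output.
example = [("OK", 0.10), ("H", 0.72), ("V", 0.15), ("S", 0.03)]
hits = flagged_labels(example)
print("flagged" if hits else "allowed", hits)
```

Passing the label_prob_pairs list from the example above works the same way. A lower threshold catches more borderline text at the cost of more false positives.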

Cloud GPUs

To speed up inference, especially on large batches of text, consider running the model on a cloud GPU from a provider such as AWS, Google Cloud, or Azure.
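
On a GPU instance, the model and its inputs have to be moved to the GPU explicitly. The following is a minimal sketch that assumes PyTorch with CUDA support is installed on the instance. pick_device is a hypothetical helper that falls back to the CPU when no GPU (or no PyTorch) is available.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU can be reported.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Running on: {device}")

# With the model and tokenizer loaded as in the Python example:
# model.to(device)
# inputs = tokenizer("I love AutoTrain", return_tensors="pt").to(device)
# with torch.no_grad():
#     outputs = model(**inputs)
```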

License

The model is licensed under the CodeML OpenRAIL-M 0.1 license. This license allows free access, use, modification, and distribution for both commercial and non-commercial purposes, provided certain conditions are met. These include not using the model for unlawful or harmful purposes, respecting privacy, and acknowledging the model's source. The model is provided "as is" without warranties, and the licensor holds no liability for any resulting damages.
