english sarcasm detector
helinivan
Introduction
The English Sarcasm Detector is a text classification model designed to identify sarcasm in news article titles. It is built on the bert-base-uncased model and fine-tuned on a dataset from Kaggle. The model labels each title as sarcastic (label 1) or non-sarcastic (label 0).
Architecture
This sarcasm detector is based on the BERT architecture, specifically the bert-base-uncased variant. It is loaded through the Transformers library with PyTorch and performs binary text classification; the reported accuracy and F1 score are given under Training.
Training
The model is trained on the "News Headlines Dataset For Sarcasm Detection" available on Kaggle. The specific dataset used for training is helinivan/sarcasm_headlines_multilingual. The model reports an F1 score of 92.38 and an accuracy of 92.42.
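As a rough illustration of how accuracy and F1 relate, the sketch below computes both from a small set of labels. The labels are made up for the example, and it assumes binary F1 on the sarcastic class (label 1); the source does not state how the reported scores were averaged or which data split they were measured on.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 on the positive class) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return accuracy, f1


# Hypothetical toy labels, not taken from the model's evaluation data.
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 0]
toy_accuracy, toy_f1 = accuracy_and_f1(toy_true, toy_pred)
```

Here one sarcastic title is missed and one non-sarcastic title is flagged, so both accuracy and F1 come out at 0.75. On less balanced data the two figures can diverge, which is one reason to report both.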
Guide: Running Locally
- Install Dependencies: Ensure you have transformers and torch installed in your Python environment.
    pip install transformers torch
- Preprocess Data: Lowercase the text and remove punctuation to prepare it for tokenization.
- Load Model and Tokenizer: Use the helinivan/english-sarcasm-detector model path with AutoTokenizer and AutoModelForSequenceClassification.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    MODEL_PATH = "helinivan/english-sarcasm-detector"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)
- Tokenize Input: Tokenize your text input with appropriate padding and truncation.
- Make Predictions: Pass the tokenized data through the model to obtain predictions and confidence scores.
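The preprocessing, tokenization, and prediction steps above could be put together roughly as follows. This is a minimal sketch, not the author's reference code: the helper names (preprocess, load_model, predict) and the max_length value of 256 are assumptions made for illustration, and the confidence score is taken to be the softmax probability of the predicted label.

```python
import string

MODEL_PATH = "helinivan/english-sarcasm-detector"


def preprocess(text: str) -> str:
    """Lowercase the text and remove ASCII punctuation before tokenization."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return no_punct.strip()


def load_model(model_path: str = MODEL_PATH):
    """Load the tokenizer and model once; the first call downloads the weights."""
    # Imported lazily so preprocess() works without the heavy dependencies.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def predict(text: str, tokenizer, model) -> dict:
    """Classify one headline and return the label with its confidence."""
    import torch

    inputs = tokenizer(
        [preprocess(text)],
        padding=True,
        truncation=True,
        max_length=256,  # assumed cap, not a value stated in the guide
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    label = int(torch.argmax(probs))
    # Label 1 is sarcastic and label 0 is non-sarcastic, per the introduction.
    return {"is_sarcastic": label == 1, "confidence": float(probs[label])}
```

A typical call would load the model once with tokenizer, model = load_model() and then reuse the pair across many predict(headline, tokenizer, model) calls, since reloading the weights for every headline is slow.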
For faster inference, consider running the model on a GPU. Cloud providers such as AWS or Google Cloud offer scalable GPU resources for this.
License
The licensing information for the English Sarcasm Detector model is not specified in the provided content. For detailed licensing terms, refer to the official repository or model card on Hugging Face's platform.