T5-base fine-tuned for Sarcasm Detection on Twitter

mrm8488

Introduction

The t5-base-finetuned-sarcasm-twitter model is a fine-tuned version of Google's T5-base model, built for sarcasm detection on Twitter. It leverages transfer learning and casts sequence classification as a text-generation task: the model generates a short label string rather than predicting a class index.

Architecture

The model employs the T5 architecture, a unified text-to-text transformer framework. This approach converts diverse NLP problems into a single text-to-text format, which makes transfer learning straightforward. T5 was first pre-trained on a large unlabeled corpus before being fine-tuned on the downstream task using the Twitter Sarcasm Dataset.
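The text-to-text idea can be illustrated with plain strings: every task, including classification, becomes an input sequence and a target sequence that the model learns to generate. The task prefixes and label strings below are illustrative assumptions, not the exact formats used to train this model.

```python
# Illustrative sketch of T5's text-to-text framing: every task is a
# string-to-string mapping. The prefixes and label strings here are
# generic examples, not the verified training format of this model.
examples = [
    # translation: a prefix names the task, the target is the translation
    ("translate English to German: That is good.", "Das ist gut."),
    # classification: the "target" is simply the label rendered as text
    ("Oh great, another Monday.", "sarcastic"),
    ("The meeting starts at 3pm.", "not sarcastic"),
]

for source, target in examples:
    print(f"input:  {source!r}  ->  target: {target!r}")
```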

Training

Fine-tuning was conducted using a modified version of a training script from Suraj Patil's work. The training utilized the Twitter Sarcasm Dataset, which comprises conversations labeled for sarcasm detection. The dataset was preprocessed to fit the text-to-text classification task. The model achieved an accuracy of 83% on the test set, with balanced precision, recall, and F1-scores.
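The preprocessing described above can be sketched as follows. The label strings ("derison" for sarcastic, "normal" for non-sarcastic), the turn-joining scheme, and the `</s>` end-of-sequence marker are assumptions based on common T5 fine-tuning practice, not a verified copy of the original training script.

```python
# Hedged sketch of converting labeled Twitter conversations into
# text-to-text (input, target) pairs for T5 fine-tuning. The label
# strings and the '</s>' marker are assumptions, not the verified
# format of the original script.
def build_example(turns, label):
    """Join conversation turns into one input string and map the
    label to a target string the model learns to generate."""
    source = " ".join(turns) + " </s>"  # T5 inputs end with an EOS marker
    target = "derison" if label == "sarcasm" else "normal"
    return source, target

pairs = [
    build_example(["Wow, what a surprise.", "I know, shocking."], "sarcasm"),
    build_example(["The train leaves at 9."], "not_sarcasm"),
]
```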

Guide: Running Locally

  1. Install Required Libraries: Ensure you have the transformers library installed.

    pip install transformers sentencepiece torch
    
  2. Load the Model and Tokenizer:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")
    # AutoModelWithLMHead is deprecated; AutoModelForSeq2SeqLM is the
    # current class for encoder-decoder models such as T5
    model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/t5-base-finetuned-sarcasm-twitter")
    
  3. Run Sarcasm Detection: Define a function to evaluate sarcasm in conversation text using the model.

    def eval_conversation(text):
        # the model was fine-tuned with '</s>' appended to each input
        input_ids = tokenizer.encode(text + '</s>', return_tensors='pt')
        # the output label is a short string, so a few tokens suffice
        output = model.generate(input_ids=input_ids, max_length=3)
        # decode the generated label, dropping <pad> and </s> tokens
        dec = [tokenizer.decode(ids, skip_special_tokens=True) for ids in output]
        return dec[0].strip()
    
  4. Evaluate Conversations: Use the eval_conversation function on Twitter dialogues to detect sarcasm.

  5. Cloud GPUs: For enhanced performance, consider using cloud GPU platforms such as AWS EC2, Google Cloud Platform, or Azure for model inference.
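Putting the steps above together, the decoded string can be mapped back to a boolean. The "derison" / "normal" label strings are assumptions about this model's output vocabulary; inspect an actual decoded output before relying on them.

```python
# Hedged sketch: interpret the string produced by eval_conversation.
# The label strings ("derison" = sarcastic, "normal" = not sarcastic)
# are assumptions about this model's output vocabulary.
def is_sarcastic(decoded: str) -> bool:
    """Map a decoded label string to a boolean, stripping any
    leftover special tokens or padding around the label."""
    cleaned = decoded.replace("<pad>", "").replace("</s>", "").strip()
    return cleaned.lower().startswith("derison")

print(is_sarcastic("<pad> derison"))  # → True
print(is_sarcastic("<pad> normal"))   # → False
```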

License

The model and its associated code were created by Manuel Romero (mrm8488) and are likely subject to Hugging Face's model-sharing policies. Users should check the Hugging Face model page for the specific license terms and usage limitations.
