bert-tiny-finetuned-enron-spam-detection

mrm8488

Introduction

The bert-tiny-finetuned-enron-spam-detection model is a fine-tuned version of Google's BERT-Tiny, tailored for spam detection on the SetFit/enron_spam dataset. It classifies email text into two classes, spam or ham (not spam), with high precision and recall.

Architecture

The model is based on the Google BERT-Tiny architecture, which is a smaller variant of the original BERT model. It uses only 2 layers with a hidden size of 128 and 2 attention heads, making it efficient for tasks with limited computational resources.
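These dimensions correspond to Google's published BERT-Tiny checkpoint and can be expressed as a BertConfig; a minimal sketch (constructing the config only, not loading the fine-tuned weights):

```python
from transformers import BertConfig

# BERT-Tiny dimensions: 2 transformer layers, hidden size 128, 2 attention heads.
config = BertConfig(
    num_hidden_layers=2,
    hidden_size=128,
    num_attention_heads=2,
    intermediate_size=512,  # BERT convention: 4x the hidden size
)
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```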

Training

The model was trained using the following hyperparameters:

  • Learning Rate: 2e-05
  • Train Batch Size: 16
  • Evaluation Batch Size: 32
  • Seed: 42
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Learning Rate Scheduler: Linear
  • Number of Epochs: 4

The training was conducted using the Enron spam dataset, and the model achieved the following results on the evaluation set:

  • Loss: 0.0593
  • Precision: 0.9851
  • Recall: 0.9871
  • Accuracy: 0.986
  • F1 Score: 0.9861
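These are the standard binary-classification metrics; a quick sketch of how they are computed with scikit-learn, using a small made-up label set for illustration rather than the Enron evaluation data:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

# Toy ground-truth and predicted labels (1 = spam, 0 = ham) for illustration.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

print(precision_score(y_true, y_pred))  # fraction of predicted spam that is spam
print(recall_score(y_true, y_pred))     # fraction of actual spam that is caught
print(accuracy_score(y_true, y_pred))   # fraction of all emails labeled correctly
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```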

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have the following Python packages installed:

    • Transformers 4.23.1
    • PyTorch 1.12.1+cu113
    • Datasets 2.6.1
    • Tokenizers 0.13.1
  2. Clone the Repository (optional): Clone the model repository from Hugging Face to your local machine, or skip this step and let from_pretrained download the weights automatically.

  3. Load the Model: Use the Transformers library to load the model and tokenizer:

    from transformers import BertTokenizer, BertForSequenceClassification
    
    tokenizer = BertTokenizer.from_pretrained('mrm8488/bert-tiny-finetuned-enron-spam-detection')
    model = BertForSequenceClassification.from_pretrained('mrm8488/bert-tiny-finetuned-enron-spam-detection')
    
  4. Inference: Prepare your input text, tokenize it, and run inference through the model.

For improved performance, consider using cloud GPU services such as AWS EC2, Google Cloud, or Azure.

License

The model is licensed under the Apache-2.0 license. This allows for both personal and commercial use, modification, and distribution, with proper credit to the original authors.
