lxyuan/distilbert-base-multilingual-cased-sentiments-student

Introduction

The distilbert-base-multilingual-cased-sentiments-student model is designed for multilingual sentiment analysis. It is distilled from a zero-shot classification pipeline on the Multilingual Sentiment dataset: the teacher model, MoritzLaurer/mDeBERTa-v3-base-mnli-xnli, labels the training texts using the hypothesis template "The sentiment of this text is {}.", and the student is trained to reproduce those predictions.
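
To make the distillation setup concrete, the following minimal sketch shows how the teacher can generate soft labels with that hypothesis template. The candidate labels (positive, neutral, negative) match the student's output classes; the example sentence is reused from the inference guide below.

    from transformers import pipeline
    
    # Teacher: an NLI model wrapped as a zero-shot classifier.
    teacher = pipeline(
        "zero-shot-classification",
        model="MoritzLaurer/mDeBERTa-v3-base-mnli-xnli",
    )
    
    # Each candidate label is slotted into the hypothesis template,
    # e.g. "The sentiment of this text is positive."
    out = teacher(
        "I love this movie and i would watch it again and again!",
        candidate_labels=["positive", "neutral", "negative"],
        hypothesis_template="The sentiment of this text is {}.",
    )
    print(out["labels"], out["scores"])  # distribution usable as soft targets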

Architecture

The model is based on DistilBERT, a smaller, faster variant of BERT, and performs sentiment analysis across 12 languages, including English, Arabic, German, and Japanese. Because it is produced by zero-shot distillation, the student approximates the teacher's zero-shot classification behavior while being considerably cheaper to run.
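
As a quick sanity check of those claims, the snippet below loads the student and prints its size and label set; the parameter count in the comment is approximate.

    from transformers import AutoModelForSequenceClassification
    
    model = AutoModelForSequenceClassification.from_pretrained(
        "lxyuan/distilbert-base-multilingual-cased-sentiments-student"
    )
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e6:.0f}M parameters")  # roughly 135M; most sit in the multilingual embedding matrix
    print(model.config.id2label)  # the three sentiment classes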

Training

Training distills the teacher's zero-shot predictions into the student using the provided script. Key hyperparameters include a batch size of 32 for the teacher and 16 for the student, with mixed-precision (fp16) training to improve efficiency. The reported run took approximately 2009 seconds and reached 88.29% agreement between student and teacher predictions, i.e., the two models chose the same top label on 88.29% of the evaluation examples.
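
The write-up does not spell out the distillation objective; a standard choice for this setup, and the assumption behind the sketch below, is a soft cross-entropy between the student's predicted distribution and the teacher's zero-shot probabilities. The agreement metric is then the fraction of examples on which the two models pick the same top label.

    import torch
    import torch.nn.functional as F
    
    def distillation_loss(student_logits, teacher_probs):
        """Soft cross-entropy: student log-probabilities against teacher targets."""
        log_probs = F.log_softmax(student_logits, dim=-1)
        return -(teacher_probs * log_probs).sum(dim=-1).mean()
    
    def agreement(student_logits, teacher_probs):
        """Fraction of examples where student and teacher agree on the top label."""
        return (student_logits.argmax(-1) == teacher_probs.argmax(-1)).float().mean()
    
    # Toy check: when the student matches the teacher exactly, the loss equals
    # the entropy of the teacher distribution and agreement is 1.0.
    logits = torch.tensor([[2.0, 0.5, -1.0]])
    probs = F.softmax(logits, dim=-1)
    print(distillation_loss(logits, probs), agreement(logits, probs))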

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have the required libraries: Transformers, PyTorch, Datasets, and Tokenizers.

  2. Setup Environment: Use a cloud GPU service such as Google Colab or AWS EC2 for optimal performance, as training on a local machine might be resource-intensive.

  3. Inference Example (a multilingual variant is sketched after this list):

    # Dependencies: transformers and torch (datasets/tokenizers are needed for training only).
    from transformers import pipeline
    
    # The task ("text-classification") is inferred from the model's config.
    distilled_student_sentiment_classifier = pipeline(
        model="lxyuan/distilbert-base-multilingual-cased-sentiments-student",
        return_all_scores=True,  # deprecated in recent Transformers; use top_k=None instead
    )
    
    # Scores are returned for all three labels: positive, neutral, negative.
    result = distilled_student_sentiment_classifier(
        "I love this movie and i would watch it again and again!"
    )
    print(result)
    
  4. Run Training: Use the provided training script, modifying it as needed for your environment; on Colab in particular, reduce batch sizes to avoid out-of-memory errors.
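
Because the model is multilingual, the same pipeline from step 3 handles non-English input directly. A short usage sketch; the Malay and Japanese sentences are rough translations of the English example:

    # Reuses distilled_student_sentiment_classifier from step 3.
    examples = [
        "I love this movie and i would watch it again and again!",        # English
        "Saya suka filem ini dan saya akan menontonnya lagi dan lagi!",   # Malay
        "私はこの映画が大好きで、何度も見ます！",                          # Japanese
    ]
    for text in examples:
        print(distilled_student_sentiment_classifier(text))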

License

This model is licensed under the Apache License 2.0, which permits personal and commercial use, modification, and distribution, provided the license's notice and attribution requirements are preserved.
