gokuls/bert-tiny-massive-intent-KD-BERT
Introduction
bert-tiny-massive-intent-KD-BERT is a fine-tuned version of Google's compact BERT model google/bert_uncased_L-2_H-128_A-2, trained on the MASSIVE dataset for text classification. It achieves a loss of 0.8380 and an accuracy of 0.8534 on the evaluation set.
Architecture
This model is based on the BERT architecture, whose transformer-based design makes it well suited to natural language processing tasks such as text classification.
Training
The model was trained using the following hyperparameters:
- Learning Rate: 5e-05
- Train Batch Size: 16
- Eval Batch Size: 16
- Seed: 33
- Optimizer: Adam (betas=(0.9,0.999), epsilon=1e-08)
- Learning Rate Scheduler Type: Linear
- Number of Epochs: 50
- Mixed Precision Training: Native AMP
Training results showed progressive improvement in accuracy, reaching 0.8534, with loss decreasing over the epochs.
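The sketch below shows how these hyperparameters map onto a standard Hugging Face Trainer setup. It is illustrative only: the dataset identifier (AmazonScience/massive, en-US config), the column names (utt, intent), and the plain fine-tuning loop are assumptions, and it omits the knowledge-distillation (KD) step implied by the model name.

```python
# Illustrative fine-tuning sketch using the hyperparameters listed above.
# Dataset id, column names, and the absence of a distillation teacher are
# assumptions, not a reproduction of the original training script.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

dataset = load_dataset("AmazonScience/massive", "en-US")
num_labels = dataset["train"].features["intent"].num_classes

tokenizer = AutoTokenizer.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
model = AutoModelForSequenceClassification.from_pretrained(
    "google/bert_uncased_L-2_H-128_A-2", num_labels=num_labels)

def tokenize(batch):
    out = tokenizer(batch["utt"], truncation=True)
    out["labels"] = batch["intent"]
    return out

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-tiny-massive-intent-kd-bert",
    learning_rate=5e-5,              # Learning Rate: 5e-05
    per_device_train_batch_size=16,  # Train Batch Size
    per_device_eval_batch_size=16,   # Eval Batch Size
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                       # Native AMP mixed precision
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["validation"],
                  tokenizer=tokenizer)
trainer.train()
```

The optimizer settings listed above (Adam with betas=(0.9, 0.999), epsilon=1e-08) match the Trainer defaults, so they do not need to be set explicitly.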
Guide: Running Locally
To run this model locally, follow these steps:
- Install Required Packages:
  - Ensure you have Python installed.
  - Use pip to install the necessary libraries: transformers, torch, datasets, and tokenizers.
- Download the Model:
  - You can download the model from the Hugging Face model hub.
- Set Up Your Environment:
  - It's recommended to run the model on a machine with a GPU for better performance. Consider using cloud GPU services such as AWS, GCP, or Azure.
- Execute the Model:
  - Use the Transformers library to load and run the model on your text data for classification tasks; a minimal example follows below.
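As a minimal inference sketch, the snippet below loads the model through the text-classification pipeline. The hub identifier gokuls/bert-tiny-massive-intent-KD-BERT is inferred from the title above; adjust it if your copy of the model lives under a different name.

```python
# Minimal inference example using the Transformers text-classification pipeline.
# The hub id below is inferred from the model name and may need adjusting.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="gokuls/bert-tiny-massive-intent-KD-BERT",
)

print(classifier("wake me up at seven tomorrow morning"))
# -> [{"label": <predicted intent>, "score": <confidence>}]
```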
License
This model is licensed under the Apache 2.0 License, which allows for both personal and commercial use, modification, and distribution.