textattack/roberta-base-MRPC
Introduction
This TextAttack model is based on the RoBERTa-base architecture and fine-tuned for sequence classification on the GLUE benchmark, specifically the MRPC (Microsoft Research Paraphrase Corpus) paraphrase-identification task.
Architecture
The model uses the RoBERTa-base transformer architecture, a robustly optimized variant of BERT that improves performance on downstream tasks such as text classification.
Training
The model is fine-tuned using TextAttack and the nlp library (since renamed to Hugging Face datasets) for 5 epochs. Training parameters include:
- Batch size: 16
- Learning rate: 3e-05
- Maximum sequence length: 256
- Loss function: Cross-entropy
The model achieved an accuracy of 0.9118 on the evaluation set after 2 epochs.
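The checkpoint itself was produced with TextAttack's training pipeline, which is not reproduced on this card. As a rough sketch of how the hyperparameters above map onto a standard Hugging Face Trainer setup (the output_dir is illustrative, and the exact TextAttack script may differ):

```python
# Hedged sketch: reproduces the listed hyperparameters (5 epochs, batch size 16,
# learning rate 3e-05, max length 256) with the Hugging Face Trainer. Cross-entropy
# is the default loss for sequence-classification models. Not the original TextAttack script.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# MRPC consists of sentence pairs labeled 1 (paraphrase) or 0 (not a paraphrase).
dataset = load_dataset("glue", "mrpc")

def tokenize(batch):
    return tokenizer(
        batch["sentence1"],
        batch["sentence2"],
        truncation=True,
        max_length=256,
        padding="max_length",
    )

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-base-mrpc",  # illustrative path
    num_train_epochs=5,
    per_device_train_batch_size=16,
    learning_rate=3e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.evaluate()  # reports eval loss on the MRPC validation split (add compute_metrics for accuracy)
```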
Guide: Running Locally
- Clone the Repository: Clone the TextAttack repository from GitHub.
- Install Dependencies: Ensure you have Python installed, then run pip install textattack to install the required libraries.
- Download the Model: Use Hugging Face's Transformers library to download and load the model:

  ```python
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

  model = AutoModelForSequenceClassification.from_pretrained("textattack/roberta-base-MRPC")
  tokenizer = AutoTokenizer.from_pretrained("textattack/roberta-base-MRPC")
  ```

- Run Inference: Tokenize your input and perform inference using the model (see the sketch after this list).
- Optimize with Cloud GPUs: For faster training and inference, consider using cloud GPU services such as AWS EC2, Google Cloud, or Azure.
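To make the Run Inference step concrete, here is a minimal sketch; the sentence pair is invented for illustration, and under the GLUE MRPC convention label 1 means the two sentences are paraphrases:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("textattack/roberta-base-MRPC")
tokenizer = AutoTokenizer.from_pretrained("textattack/roberta-base-MRPC")
model.eval()

# MRPC is a sentence-pair task: encode both sentences together.
sentence1 = "The company said the layoffs will be completed by June."
sentence2 = "Layoffs at the company are expected to finish by June."
inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits

prediction = logits.argmax(dim=-1).item()
print("paraphrase" if prediction == 1 else "not a paraphrase")
```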
License
The model and related code are subject to the licensing terms provided in the TextAttack GitHub repository; refer to that repository for detailed licensing information.