roberta argument
chkla
Introduction
RoBERTArg is a text classification model that labels each input sentence as either ARGUMENT or NON-ARGUMENT. It was trained on manually annotated sentences about controversial topics and is intended to support argument mining.
Architecture
RoBERTArg is based on the RoBERTa (base) architecture. It starts from RoBERTa's pre-trained weights and is fine-tuned for sentence-level argument identification.
Training
The model was fine-tuned on approximately 25,000 sentences from a dataset compiled by Stab et al. (2018). The dataset covers eight controversial topics, and each sentence is labeled as either ARGUMENT or NON-ARGUMENT. Fine-tuning used a learning rate of 2.3102e-06, a batch size of 64, and two epochs. The model reached an accuracy of 81.93% on a held-out evaluation set.
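To put the reported hyperparameters in perspective, the sketch below collects them in one place and estimates how many optimizer steps the run would have taken. The class name `FineTuneConfig` and its fields are hypothetical. The sentence count is rounded to exactly 25,000, and the calculation assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of this is confirmed by the original training setup.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters reported above."""

    learning_rate: float = 2.3102e-06
    batch_size: int = 64
    num_epochs: int = 2
    # Assumption: "approximately 25,000" is rounded to exactly 25,000 here.
    num_sentences: int = 25_000

    def steps_per_epoch(self) -> int:
        # One optimizer step per batch; the final partial batch is kept.
        return math.ceil(self.num_sentences / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.num_epochs


config = FineTuneConfig()
steps_per_epoch = config.steps_per_epoch()  # 391
total_steps = config.total_steps()          # 782
```

Under these assumptions the run is short, roughly 780 updates, which fits with the small learning rate: the pre-trained weights are only adjusted slightly.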
Guide: Running Locally
- Install Dependencies: Make sure to have Python and PyTorch installed. Use pip to install the Hugging Face Transformers library.
- Download the Model: Retrieve the RoBERTArg model from the Hugging Face model hub.
- Load the Model: Utilize the Transformers library to load the model and tokenizer.
- Run Inference: Input your text data to classify it as ARGUMENT or NON-ARGUMENT.
Inference runs on a CPU, but a GPU speeds up classification of large batches. If no local GPU is available, cloud providers such as AWS, Google Cloud, or Azure offer GPU instances.
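The steps above can be sketched with the Transformers `pipeline` API, after installing the dependencies with `pip install transformers torch`. The model identifier `chkla/roberta-argument` is an assumption pieced together from the title and author name on this page, so check it against the model hub before relying on it. The helper names `load_classifier` and `classify` are hypothetical, and the label strings returned depend on the model's own configuration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: hub identifier inferred from this page, not confirmed.
ASSUMED_MODEL_ID = "chkla/roberta-argument"


def load_classifier(model_id: str = ASSUMED_MODEL_ID, device: int = -1) -> Callable:
    """Download (on first use) and load the model and tokenizer.

    device=-1 runs on CPU; device=0 selects the first GPU.
    """
    # Imported here so that the rest of the module works without Transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id, device=device)


def classify(
    sentences: Sequence[str], classifier: Callable
) -> List[Tuple[str, str, float]]:
    """Return (sentence, label, score) for each input sentence.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, as a text-classification pipeline does.
    """
    results = classifier(list(sentences))
    return [
        (sentence, result["label"], float(result["score"]))
        for sentence, result in zip(sentences, results)
    ]
```

A typical call would be `classify(["Nuclear power reduces carbon emissions."], load_classifier())`, which downloads the weights the first time it runs. Passing `device=0` to `load_classifier` moves inference to a GPU when one is available.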
License
The RoBERTArg model and its associated resources are subject to the licensing terms specified by the original creators. Ensure compliance with these terms when using or distributing the model.