distilroberta-finetuned-financial-text-classification
Introduction
The DistilRoBERTa model fine-tuned for financial text classification is designed to determine the sentiment of financial news articles. It evaluates text to classify sentiment as negative, neutral, or positive. The model incorporates financial news data, including Covid-19 sentiment information, to enhance its prediction capabilities.
Architecture
This model is based on the distilroberta-base architecture. It has been fine-tuned on the financial-phrasebank dataset and a Kaggle dataset containing sentiments related to Covid-19 impacts on financial markets. To give less frequently sampled labels more attention, class weights were adjusted during training, aiming to improve overall performance.
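As an illustration of the class-weighting approach, the sketch below overrides compute_loss in a transformers Trainer subclass to apply weighted cross-entropy. The weight values here are hypothetical placeholders; the actual weights used for this model are not published in the card.

    import torch
    from torch import nn
    from transformers import Trainer

    # Hypothetical per-class weights (negative, neutral, positive); the
    # actual values used for this model are not published.
    CLASS_WEIGHTS = torch.tensor([2.0, 1.0, 2.0])

    class WeightedLossTrainer(Trainer):
        """Trainer that upweights under-represented sentiment classes."""

        def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
            labels = inputs.pop("labels")
            outputs = model(**inputs)
            logits = outputs.logits
            # Weighted cross-entropy gives rare labels a larger gradient signal.
            loss_fct = nn.CrossEntropyLoss(weight=CLASS_WEIGHTS.to(logits.device))
            loss = loss_fct(logits.view(-1, model.config.num_labels), labels.view(-1))
            return (loss, outputs) if return_outputs else loss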
Training
The model was trained using the following hyperparameters:
- Learning Rate: 2e-05
- Train Batch Size: 64
- Evaluation Batch Size: 64
- Seed: 42
- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- Learning Rate Scheduler: Linear
- Epochs: 10
- Mixed Precision Training: Native AMP
On the evaluation set, the model achieved an F1 score of 0.8835 with a loss of 0.4463. Training used Transformers 4.15.0, PyTorch 1.10.0+cu111, Datasets 1.18.0, and Tokenizers 0.10.3.
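For reference, the hyperparameters above translate into transformers TrainingArguments roughly as follows. This is a sketch: the output directory is a placeholder and the dataset and model setup are omitted, as the original training script is not published.

    from transformers import TrainingArguments

    # Sketch matching the reported hyperparameters; "finetune-output" is a
    # placeholder path, not the author's actual configuration.
    training_args = TrainingArguments(
        output_dir="finetune-output",
        learning_rate=2e-5,
        per_device_train_batch_size=64,
        per_device_eval_batch_size=64,
        seed=42,
        num_train_epochs=10,
        lr_scheduler_type="linear",
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-8,
        fp16=True,  # Native AMP mixed-precision training
    )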
Guide: Running Locally
To run the model locally, follow these steps:
- Clone the Repository:

    git clone https://huggingface.co/nickmuchi/distilroberta-finetuned-financial-text-classification
    cd distilroberta-finetuned-financial-text-classification
- Install Dependencies: Ensure you have Python installed, then install the necessary packages:

    pip install transformers torch datasets
- Run the Model: Load and run the model:

    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model="nickmuchi/distilroberta-finetuned-financial-text-classification",
    )
    result = classifier("The USD rallied by 10% last night")
    print(result)  # a list of dicts, e.g. [{'label': 'positive', 'score': ...}]
- Use Cloud GPUs: For improved performance, consider using cloud GPU providers such as AWS EC2, Google Cloud, or Azure; a minimal GPU sketch follows this list.
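If a GPU is available, the pipeline can be placed on it via the standard device argument of transformers pipelines; a minimal sketch:

    import torch
    from transformers import pipeline

    # device=0 places the model on the first CUDA GPU when one is available;
    # device=-1 (the default) falls back to CPU.
    device = 0 if torch.cuda.is_available() else -1
    classifier = pipeline(
        "text-classification",
        model="nickmuchi/distilroberta-finetuned-financial-text-classification",
        device=device,
    )
    print(classifier("Stocks tumbled after the rate announcement"))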
License
This model is licensed under the Apache 2.0 License, which allows for both personal and commercial use with proper attribution.