distilroberta-finetuned-financial-news-sentiment-analysis
mrm8488
Introduction
The distilroberta-finetuned-financial-news-sentiment-analysis model is a fine-tuned version of DistilRoBERTa-base designed for sentiment analysis of financial news. It is tailored to text classification and achieves high accuracy on the financial_phrasebank dataset.
Architecture
This model is a distilled version of RoBERTa-base, trained following a procedure similar to DistilBERT's. It has 6 layers, a hidden size of 768, and 12 attention heads, for a total of 82 million parameters, compared with 125 million for RoBERTa-base. This makes it more efficient: on average, it runs about twice as fast as the original model.
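As a rough sanity check, the quoted parameter count can be approximated from the architecture numbers above. The sketch below assumes RoBERTa's standard configuration values (vocabulary size 50,265, feed-forward dimension 3,072, 514 position embeddings) and ignores biases and LayerNorm weights; these details are not stated in this card.

```python
# Back-of-envelope parameter count for DistilRoBERTa-base.
# Assumed (not stated above): vocab size 50,265, FFN dim 3,072,
# 514 position embeddings; biases and LayerNorms are ignored.
d = 768          # hidden size
layers = 6
vocab = 50_265
ffn = 3_072
positions = 514

embeddings = (vocab + positions) * d     # token + position embeddings
attention_per_layer = 4 * d * d          # Q, K, V, output projections
ffn_per_layer = 2 * d * ffn              # up- and down-projection
total = embeddings + layers * (attention_per_layer + ffn_per_layer)

print(f"~{total / 1e6:.0f}M parameters")  # close to the quoted 82M
```

The small gap to the official 82M figure comes from the bias and LayerNorm terms this estimate omits.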
Training
The model was trained on the financial_phrasebank dataset, which comprises 4,840 sentences from financial news, each labeled by sentiment and annotated by 5 to 8 annotators. Training used the following hyperparameters:
- Learning Rate: 2e-05
- Train Batch Size: 8
- Eval Batch Size: 8
- Seed: 42
- Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- Learning Rate Scheduler Type: Linear
- Number of Epochs: 5
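The hyperparameters above map directly onto keyword arguments of the Transformers `TrainingArguments` class. The following is a sketch of the equivalent configuration; the argument names come from the Transformers API, not from the original training script, which is not shown here.

```python
# Hyperparameters from the card expressed as TrainingArguments kwargs.
training_kwargs = dict(
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)

# These could then be passed to a Trainer, e.g.:
# from transformers import TrainingArguments, Trainer
# args = TrainingArguments(output_dir="out", **training_kwargs)
```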
Guide: Running Locally
- Environment Setup: Ensure you have Python and the necessary dependencies installed. It's recommended to create a virtual environment.
- Install Dependencies: Use pip to install transformers, torch, and datasets.

  ```bash
  pip install transformers torch datasets
  ```
- Load the Model: Use the Hugging Face Transformers library to load the model.
  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification

  model_id = "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForSequenceClassification.from_pretrained(model_id)
  ```
- Inference: Tokenize input text and perform sentiment analysis.
  ```python
  inputs = tokenizer("Your financial news headline here", return_tensors="pt")
  outputs = model(**inputs)
  ```
- Consider Cloud GPUs: For faster inference, use cloud services like AWS, GCP, or Azure, which offer GPU support.
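The raw model outputs in the inference step above are logits, not probabilities. Below is a minimal pure-Python sketch of the post-processing step, assuming the checkpoint's label order is negative/neutral/positive; verify this against `model.config.id2label` before relying on it.

```python
import math

# Assumed label order for this checkpoint (check model.config.id2label):
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Return (label, confidence) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]

# Example logits, as outputs.logits[0].tolist() might return them:
label, confidence = predict_label([-1.2, 0.3, 2.1])
# label == "positive" for these example logits
```

In practice the same result can be obtained with `torch.softmax(outputs.logits, dim=-1)`; the pure-Python version just makes the arithmetic explicit.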
License
This model is licensed under the Apache 2.0 License, allowing for both commercial and non-commercial use.