FinBERT-ESG
yiyanghkust
Introduction
FinBERT-ESG is a fine-tuned version of the FinBERT model for analyzing financial texts with a focus on Environmental, Social, and Governance (ESG) factors. It was fine-tuned on 2,000 manually annotated sentences drawn from firms' ESG and annual reports, and it classifies text into one of four categories: Environmental, Social, Governance, or None.
Architecture
FinBERT-ESG is based on the BERT architecture and utilizes PyTorch and the Transformers library. It is specifically designed for text classification tasks in the financial domain, particularly ESG analysis.
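As a rough illustration of what a four-way classification head does with its output, the sketch below turns four raw scores (logits) into a label and a confidence score using a softmax. It is a simplified stand-in, not the model's actual code. The names `LABELS`, `softmax`, and `pick_label` are hypothetical, the label order is illustrative only (the real mapping should be read from the model's configuration), and the logits are made up.

```python
import math

# Illustrative label order only; the real index-to-label mapping
# should be read from the model's configuration.
LABELS = ["None", "Environmental", "Social", "Governance"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits, labels=LABELS):
    """Return the highest-probability label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


# Hypothetical logits for a single sentence, not real model output.
print(pick_label([0.1, -1.2, 3.4, 0.3]))
```

The returned dictionary mirrors the `label` and `score` fields that a text classification pipeline reports for each input.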
Training
The model was fine-tuned on 2,000 sentences manually annotated with ESG labels. The resulting classifier assigns financial text to one of the four categories above, which can support an evaluation of a business's sustainability and risk profile.
Guide: Running Locally
To run FinBERT-ESG locally, follow these steps:
- Install the Transformers library (version 4.18.0 or later recommended).
- Import the necessary components in your Python environment:
from transformers import BertTokenizer, BertForSequenceClassification, pipeline
- Load the model and tokenizer:
finbert = BertForSequenceClassification.from_pretrained('yiyanghkust/finbert-esg', num_labels=4)
tokenizer = BertTokenizer.from_pretrained('yiyanghkust/finbert-esg')
- Initialize the text classification pipeline:
nlp = pipeline("text-classification", model=finbert, tokenizer=tokenizer)
- Classify a financial text example:
results = nlp('Rhonda has been volunteering for several years for a variety of charitable community programs.')
print(results)
# Output: [{'label': 'Social', 'score': 0.9906041026115417}]
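Once many sentences from a report have been classified, the per-sentence labels can be summarized into a document-level profile. The sketch below assumes results in the same list-of-dictionaries format as the output shown above. The function name `esg_label_shares` is hypothetical, and the sample data is hand-written rather than real model output.

```python
from collections import Counter


def esg_label_shares(results):
    """Return the fraction of classified sentences assigned to each label."""
    counts = Counter(item["label"] for item in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hand-written results in the pipeline's output format (not real model output).
sample = [
    {"label": "Social", "score": 0.99},
    {"label": "Environmental", "score": 0.95},
    {"label": "Social", "score": 0.88},
    {"label": "None", "score": 0.97},
]

print(esg_label_shares(sample))
```

In practice the `sample` list would be replaced by the output of calling the pipeline on a list of sentences. Whether to drop low-confidence predictions before counting is a design choice this sketch leaves open.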
For faster inference on large volumes of text, consider running the model on a GPU, for example a cloud instance from a provider such as AWS, Google Cloud, or Azure.
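When classifying many sentences, it usually helps to send them to the model in fixed-size batches rather than one at a time. The sketch below is a minimal batching helper under stated assumptions: `classify_in_batches` and `fake_classifier` are hypothetical names, and the classifier is assumed to be any callable that takes a list of strings and returns one result dictionary per string, which is how a Transformers text classification pipeline behaves when given a list. The stand-in classifier exists only so the example runs without downloading the model.

```python
def classify_in_batches(classifier, sentences, batch_size=32):
    """Call the classifier on consecutive slices of `sentences` and collect the results in order."""
    results = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        results.extend(classifier(batch))
    return results


def fake_classifier(batch):
    """Stand-in for the real pipeline: labels every sentence 'None'."""
    return [{"label": "None", "score": 1.0} for _ in batch]


sentences = [f"Sentence number {i}." for i in range(70)]
outputs = classify_in_batches(fake_classifier, sentences, batch_size=32)
print(len(outputs))
```

To use the real model, the pipeline object created in the steps above would be passed in place of `fake_classifier`. A suitable batch size depends on the available memory and is best found by experiment.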
License
Please refer to the model's page on Hugging Face for specific licensing details. If you utilize this model in academic research, cite the paper: Huang, Allen H., Hui Wang, and Yi Yang. "FinBERT: A Large Language Model for Extracting Information from Financial Text." Contemporary Accounting Research (2022).