Introduction
FinBERT is a BERT model pre-trained on financial communication text to advance NLP research and applications in the financial domain. It was trained on a corpus of 4.9 billion tokens drawn from corporate reports, earnings call transcripts, and analyst reports. The finbert-tone model is a version of FinBERT fine-tuned for financial tone analysis on manually annotated sentences.

Architecture
The FinBERT model is built on the BERT architecture and fine-tuned for sentiment analysis in the financial domain. A classification layer on top of the encoder assigns each financial sentence one of three tones: positive, negative, or neutral. The fine-tuning dataset consists of 10,000 sentences from analyst reports, manually annotated with these three labels.
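
As a quick check, the three-way classification head can be inspected from the published configuration (a minimal sketch; the exact names in id2label depend on the hub config, and the intended mapping is noted in the usage guide below):

    from transformers import BertConfig

    # Fetch only the configuration file; no model weights are downloaded
    config = BertConfig.from_pretrained('yiyanghkust/finbert-tone')
    print(config.num_labels)  # 3
    print(config.id2label)    # per the model card: 0 = neutral, 1 = positive, 2 = negative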

Training
FinBERT was pre-trained on a massive dataset of financial texts:

  • Corporate Reports (10-K & 10-Q): 2.5 billion tokens
  • Earnings Call Transcripts: 1.3 billion tokens
  • Analyst Reports: 1.1 billion tokens

The finbert-tone model was then fine-tuned on the manually annotated sentences described above, which gives it strong performance on financial tone analysis tasks.

Guide: Running Locally
To use the FinBERT-Tone model locally for sentiment analysis, follow these steps:

  1. Installation:
    Ensure you have the transformers library installed. You can install it using pip:

    pip install transformers
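
    The pipeline also needs a deep learning backend; if PyTorch is not already present in your environment (an assumption about your setup), install it as well:

    pip install torch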
    
  2. Load the Model and Tokenizer:

    from transformers import BertTokenizer, BertForSequenceClassification, pipeline

    # Load the fine-tuned tone model (3 classes) and its tokenizer from the Hugging Face Hub
    finbert = BertForSequenceClassification.from_pretrained('yiyanghkust/finbert-tone', num_labels=3)
    tokenizer = BertTokenizer.from_pretrained('yiyanghkust/finbert-tone')
    
  3. Pipeline Setup:

    nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer)
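
    If a GPU is available, the pipeline can run on it instead of the CPU (a sketch; device=0 assumes a single CUDA device):

    nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer, device=0)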
    
  4. Sentiment Analysis:

    sentences = [
        "there is a shortage of capital, and we need extra financing",
        "growth is strong and we have plenty of liquidity",
        "there are doubts about our finances",
        "profits are flat"
    ]
    results = nlp(sentences)
    print(results)  # LABEL_0: neutral; LABEL_1: positive; LABEL_2: negative
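
    Depending on the hub configuration, the pipeline may return raw LABEL_* ids rather than readable names; a small mapping following the comment above tidies the output (a sketch; label_map mirrors that mapping):

    label_map = {"LABEL_0": "neutral", "LABEL_1": "positive", "LABEL_2": "negative"}
    for sentence, result in zip(sentences, results):
        name = label_map.get(result["label"], result["label"])
        print(f"{sentence!r}: {name} ({result['score']:.3f})")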
    

For heavier workloads, consider running inference on a GPU, either locally or through a cloud service such as AWS, Google Cloud, or Azure.

License
For licensing details, please refer to the original repository or the model card on Hugging Face. Ensure compliance with any usage terms specified.
