Llama-3-8B-Instruct-Finance-RAG-GGUF

QuantFactory

Introduction

The Llama-3-8B-Instruct-Finance-RAG-GGUF model by QuantFactory is a quantized (GGUF) version of the Llama-3-8B-Instruct-Finance-RAG model, designed for financial question answering. It is fine-tuned to answer questions using context supplied in the prompt.

Architecture

This model is based on the Meta-Llama-3-8B-Instruct architecture. It uses a LoRA adapter to optimize the model for retrieval-augmented generation (RAG) tasks, specifically in the financial domain.

Training

The model is fine-tuned on 4,000 examples from the virattt/financial-qa-10K dataset, using a LoRA adapter to improve performance on context-grounded question answering.
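The card does not publish the adapter hyperparameters. A minimal PEFT configuration sketch is shown below; the rank, alpha, dropout, and target modules are illustrative assumptions, not the settings actually used:

```python
# Hypothetical LoRA settings; the real hyperparameters are not listed in this card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    lora_dropout=0.05,                    # regularization (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
```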

Guide: Running Locally

  1. Setup Environment

    • Install the required libraries, such as transformers and torch.
    • Ensure you have Python installed.
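The setup above can be done with pip. The accelerate package is an assumption on top of the listed libraries, since the loading code in the next step uses device_map="auto", which requires it:

```shell
# Install the libraries used in the following steps.
# accelerate is assumed because device_map="auto" depends on it.
pip install --upgrade transformers torch accelerate
```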
  2. Load the Model

    from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
    
    MODEL_NAME = "curiousily/Llama-3-8B-Instruct-Finance-RAG"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")
    
    pipe = pipeline(
        task="text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=128,
        return_full_text=False,
    )
    
  3. Create a Prompt

    • Format your prompt using the Instruct prompt format.
    • Example prompt:
      prompt = """
      <|begin_of_text|><|start_header_id|>system<|end_header_id|>
      Use only the information to answer the question<|eot_id|><|start_header_id|>user<|end_header_id|>
      How much did the company's net earnings amount to in fiscal 2022?
      Information:
      
      Net earnings were $17.1 billion in fiscal 2022.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
      """
      
  4. Make Predictions

    • Use the model to generate predictions based on the prompt.
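The steps above can be sketched as a small helper that builds the Llama 3 Instruct prompt and feeds it to the pipeline from step 2. The format_prompt function is a hypothetical convenience, not part of the model's API:

```python
# Sketch: build a Llama 3 Instruct prompt and generate an answer with it.
# format_prompt is a hypothetical helper; only the special tokens follow the
# prompt format shown in step 3.

def format_prompt(question: str, context: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        "Use only the information to answer the question"
        "<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
        f"{question}\n"
        "Information:\n\n"
        f"{context}"
        "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )

prompt = format_prompt(
    "How much did the company's net earnings amount to in fiscal 2022?",
    "Net earnings were $17.1 billion in fiscal 2022.",
)
# answer = pipe(prompt)[0]["generated_text"]  # requires the pipeline from step 2
```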
  5. Cloud GPUs

    • For improved performance, consider using cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure for running the model.

License

The model uses the original Llama 3 license; a custom commercial license is available from Meta.
