Llama-3-8B-Instruct-Finance-RAG-GGUF by QuantFactory
Introduction
The Llama-3-8B-Instruct-Finance-RAG-GGUF model by QuantFactory is a GGUF-quantized version of the Llama-3-8B-Instruct-Finance-RAG model, designed for financial question answering. It is fine-tuned to answer questions using provided context data.
Architecture
This model is based on the Meta-Llama-3-8B-Instruct architecture. It uses a LoRA adapter to optimize the model for retrieval-augmented generation (RAG) tasks, specifically in the financial domain.
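LoRA fine-tuning freezes the base weights and learns a small low-rank update that is added on top of them. A toy sketch of the idea (the matrices, sizes, and values here are purely illustrative, not the model's actual adapter; real adapters act on the transformer's projection matrices):

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

d, r = 4, 1  # model width and adapter rank (toy values; in LoRA, r << d)

# Frozen pretrained weight (identity here for clarity).
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

# Trainable low-rank factors: B is d x r, A is r x d.
B = [[0.5], [0.0], [0.0], [0.0]]
A = [[0.0, 1.0, 0.0, 0.0]]

delta = matmul(B, A)  # rank-r update B @ A
W_adapted = [[w + dw for w, dw in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]
```

Only `B` and `A` are trained, so the adapter adds `2 * d * r` parameters per matrix instead of `d * d`, which is what makes LoRA fine-tuning cheap.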
Training
The model is fine-tuned on 4,000 examples from the virattt/financial-qa-10K dataset. Training uses a LoRA adapter to improve performance on context-grounded question answering.
Guide: Running Locally
- Setup Environment
  - Ensure you have Python installed.
  - Install the required libraries, such as transformers and torch.
- Load the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

MODEL_NAME = "curiousily/Llama-3-8B-Instruct-Finance-RAG"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=128,
    return_full_text=False,
)
```
- Create a Prompt
  - Format your prompt using the Llama 3 Instruct prompt format.
  - Example prompt:

```python
prompt = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Use only the information to answer the question<|eot_id|><|start_header_id|>user<|end_header_id|>
How much did the company's net earnings amount to in fiscal 2022?

Information:

Net earnings were $17.1 billion in fiscal 2022.
"""
```
- Make Predictions
  - Use the model to generate predictions based on the prompt.
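With `return_full_text=False`, a transformers text-generation pipeline returns a list of dicts whose `generated_text` field holds only the completion. A minimal sketch of extracting and cleaning the answer (the sample output below is illustrative, not an actual model run):

```python
# In practice: outputs = pipe(prompt)
# Illustrative output shape for a transformers text-generation pipeline:
outputs = [{"generated_text": "$17.1 billion<|eot_id|>"}]

def extract_answer(outputs: list) -> str:
    """Return the completion text, stripping the end-of-turn token if present."""
    return outputs[0]["generated_text"].replace("<|eot_id|>", "").strip()

answer = extract_answer(outputs)
```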
- Cloud GPUs
  - For improved performance, consider using cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure for running the model.
License
The model uses the original Llama 3 license. A custom commercial license is available via the Llama License page.