sabia-7b-GGUF

QuantFactory

Introduction

Sabiá-7B is a Portuguese language model developed by Maritaca AI. It is designed for text generation tasks and is based on the LLaMA-1-7B architecture. The model has been quantized into the GGUF format using llama.cpp for efficient local inference.
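As a rough sketch of how a GGUF file from this repository could be used, the snippet below loads it through the llama-cpp-python bindings (not mentioned in this card; installable with pip install llama-cpp-python). The filename is hypothetical and depends on the quantization level you download.

    # Minimal sketch, assuming a downloaded GGUF file such as "sabia-7b.Q4_K_M.gguf" (hypothetical name)
    from llama_cpp import Llama
    
    llm = Llama(model_path="./sabia-7b.Q4_K_M.gguf", n_ctx=2048)  # 2048 matches the model's context length
    result = llm("Classifique a resenha de filme como 'positiva' ou 'negativa'.\n", max_tokens=128)
    print(result["choices"][0]["text"])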

Architecture

Sabiá-7B is an auto-regressive language model utilizing the LLaMA-1-7B architecture. It employs the same tokenizer as LLaMA-1-7B and supports a maximum sequence length of 2048 tokens.
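For illustration, when building prompts with the Hugging Face tokenizer (as in the guide below), inputs can be truncated so they never exceed this 2048-token context window; the prompt text here is only a placeholder.

    from transformers import LlamaTokenizer
    
    tokenizer = LlamaTokenizer.from_pretrained("maritaca-ai/sabia-7b")  # same tokenizer as LLaMA-1-7B
    
    prompt = "Um texto em português potencialmente mais longo do que o contexto do modelo..."
    # Truncate the prompt to the model's 2048-token limit.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048)
    print(inputs["input_ids"].shape)  # at most (1, 2048)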

Training

The model was pretrained on the Portuguese subset of ClueWeb22, starting with the weights of LLaMA-1-7B. It was further trained on 10 billion additional tokens, covering approximately 1.4 epochs of the dataset. The data used has a cutoff of mid-2022.

Guide: Running Locally

To run Sabiá-7B locally, follow these steps:

  1. Install Dependencies: Ensure you have transformers and accelerate installed.

    pip install transformers accelerate
    
  2. Load Model and Tokenizer:

    import torch
    from transformers import LlamaTokenizer, LlamaForCausalLM
    
    tokenizer = LlamaTokenizer.from_pretrained("maritaca-ai/sabia-7b")
    model = LlamaForCausalLM.from_pretrained(
        "maritaca-ai/sabia-7b",
        device_map="auto",            # place the weights on the available GPU(s)/CPU automatically
        low_cpu_mem_usage=True,       # avoid materializing a full extra copy of the model in CPU RAM
        torch_dtype=torch.bfloat16    # use torch.float16 on GPUs without bfloat16 support
    )
    
  3. Prepare Input: Create a text prompt for generation.

    prompt = "Classifique a resenha de filme como 'positiva' ou 'negativa'.\n"
    input_ids = tokenizer(prompt, return_tensors="pt")
    
  4. Generate Output:

    output = model.generate(
        input_ids["input_ids"].to("cuda"),
        max_length=1024,
        eos_token_id=tokenizer.encode("\n")  # stop generation when a newline token is produced
    )
    # Decode the result (the returned sequence includes the prompt tokens).
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    
  5. Cloud GPUs: Consider using cloud services like AWS, GCP, or Azure for access to GPUs.

For systems with limited GPU memory, consider using 8-bit loading by installing bitsandbytes and setting load_in_8bit=True during model loading.
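A minimal sketch of that 8-bit variant is shown below; it mirrors the loading call from step 2 and assumes a CUDA GPU with bitsandbytes installed. On recent transformers versions the same effect is achieved by passing quantization_config=BitsAndBytesConfig(load_in_8bit=True) instead of the load_in_8bit shortcut.

    from transformers import LlamaForCausalLM
    
    # Requires: pip install bitsandbytes
    model = LlamaForCausalLM.from_pretrained(
        "maritaca-ai/sabia-7b",
        device_map="auto",
        low_cpu_mem_usage=True,
        load_in_8bit=True,   # quantize the weights to 8-bit at load time to reduce GPU memory usage
    )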

License

Sabiá-7B is licensed similarly to LLaMA-1, restricting its use to research purposes only. For more details, refer to the licensing terms associated with LLaMA-1.
