Sabiá-7B GGUF (QuantFactory)

Introduction
Sabiá-7B is a Portuguese language model developed by Maritaca AI. It is designed for text generation tasks and is based on the LLaMA-1-7B architecture. This repository provides GGUF quantizations of the model, produced with llama.cpp, for efficient local inference.
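Because the quantized weights ship in GGUF format, they can be loaded by llama.cpp-compatible runtimes. As a minimal sketch (assuming `pip install llama-cpp-python`; the file name `sabia-7b.Q4_K_M.gguf` is illustrative and not confirmed by this card):

```python
from llama_cpp import Llama

# Hypothetical file name; check the repo's file list for the actual quant variants.
llm = Llama(model_path="sabia-7b.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Classifique a resenha de filme como 'positiva' ou 'negativa'.\n",
    max_tokens=64,
    stop=["\n"],  # mirror the newline stopping criterion used in the guide below
)
print(out["choices"][0]["text"])
```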
Architecture
Sabiá-7B is an auto-regressive language model utilizing the LLaMA-1-7B architecture. It employs the same tokenizer as LLaMA-1-7B and supports a maximum sequence length of 2048 tokens.
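These figures can be sanity-checked from the model's Hugging Face config; a small sketch using the standard LLaMA config fields:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("maritaca-ai/sabia-7b")
print(cfg.max_position_embeddings)  # maximum sequence length: 2048
print(cfg.vocab_size)               # LLaMA-1 tokenizer vocabulary: 32000
```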
Training
The model was pretrained on the Portuguese subset of ClueWeb22, starting with the weights of LLaMA-1-7B. It was further trained on 10 billion additional tokens, covering approximately 1.4 epochs of the dataset. The data used has a cutoff of mid-2022.
Guide: Running Locally
To run Sabiá-7B locally, follow these steps:
- Install Dependencies: Ensure you have `transformers` and `accelerate` installed.

  ```sh
  pip install transformers accelerate
  ```
- Load Model and Tokenizer:

  ```python
  import torch
  from transformers import LlamaTokenizer, LlamaForCausalLM

  tokenizer = LlamaTokenizer.from_pretrained("maritaca-ai/sabia-7b")

  model = LlamaForCausalLM.from_pretrained(
      "maritaca-ai/sabia-7b",
      device_map="auto",
      low_cpu_mem_usage=True,
      torch_dtype=torch.bfloat16,
  )
  ```
- Prepare Input: Create a text prompt for generation.

  ```python
  prompt = "Classifique a resenha de filme como 'positiva' ou 'negativa'.\n"
  input_ids = tokenizer(prompt, return_tensors="pt")
  ```
- Generate Output:

  ```python
  output = model.generate(
      input_ids["input_ids"].to("cuda"),
      max_length=1024,
      eos_token_id=tokenizer.encode("\n"),  # stop generation at the first newline
  )
  ```
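`generate` returns the prompt tokens followed by the completion; a minimal decoding step, reusing the names above, might be:

```python
# Strip the prompt tokens and decode only the newly generated text.
prompt_len = input_ids["input_ids"].shape[1]
print(tokenizer.decode(output[0, prompt_len:], skip_special_tokens=True))
```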
- Cloud GPUs: Consider using cloud services such as AWS, GCP, or Azure for access to GPUs.

For systems with limited GPU memory, consider 8-bit loading: install `bitsandbytes` and set `load_in_8bit=True` when loading the model, as sketched below.
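A minimal sketch of the 8-bit variant (assuming a `transformers` version that still accepts the bare flag; newer releases prefer passing a `BitsAndBytesConfig` instead):

```python
from transformers import LlamaForCausalLM

# Requires `pip install bitsandbytes`; weights are quantized to 8-bit at load
# time, roughly halving memory use versus bfloat16.
model = LlamaForCausalLM.from_pretrained(
    "maritaca-ai/sabia-7b",
    device_map="auto",
    low_cpu_mem_usage=True,
    load_in_8bit=True,
)
```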
License
Sabiá-7B is licensed similarly to LLaMA-1, restricting its use to research purposes only. For more details, refer to the licensing terms associated with LLaMA-1.