Falcon3-7B-Base
tiiuae/Falcon3-7B-Base
Introduction
Falcon3-7B-Base is part of the Falcon3 family of Open Foundation Models, consisting of pretrained and instruct large language models (LLMs) ranging from 1 billion to 10 billion parameters. The model is designed to deliver state-of-the-art results in reasoning, language understanding, instruction following, and coding tasks. It supports English, French, Spanish, and Portuguese with a context length of up to 32,000 tokens. Note that this is a raw, pretrained model; most use cases will require fine-tuning.
Architecture
- Type: Transformer-based causal decoder-only architecture
- Blocks: 28 decoder blocks
- Attention: Grouped-query attention (GQA) with 12 query heads and 4 key-value (KV) heads (see the shape sketch after this list)
- Head Dimension: 256 (wider than is typical for models of this size)
- RoPE Base: a high value of 1000042 to support long-context understanding
- Context Length: 32,000 tokens
- Vocabulary Size: 131,000 tokens
- Training: Pretrained on 14 Teratokens of diverse datasets using 1024 H100 GPU chips
- Languages Supported: English, French, Spanish, Portuguese
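
To make the GQA numbers above concrete, here is a minimal shape sketch in PyTorch: 4 KV heads are each shared by 12 / 4 = 3 query heads. The hidden size is inferred here as 12 x 256 = 3072 (an assumption, since the card does not state it explicitly); this is an illustrative sketch, not the model's actual implementation.

import torch

# Dimensions from the architecture list; hidden size inferred as 12 * 256 = 3072
# (an assumption, not stated in the card).
num_q_heads, num_kv_heads, head_dim = 12, 4, 256
hidden_size = num_q_heads * head_dim  # 3072
batch, seq_len = 1, 8

x = torch.randn(batch, seq_len, hidden_size)

# Separate projections: queries get 12 heads, keys/values only 4.
q_proj = torch.nn.Linear(hidden_size, num_q_heads * head_dim, bias=False)
k_proj = torch.nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
v_proj = torch.nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)

q = q_proj(x).view(batch, seq_len, num_q_heads, head_dim).transpose(1, 2)
k = k_proj(x).view(batch, seq_len, num_kv_heads, head_dim).transpose(1, 2)
v = v_proj(x).view(batch, seq_len, num_kv_heads, head_dim).transpose(1, 2)

# Each KV head serves 3 query heads: repeat KV along the head axis to match.
group = num_q_heads // num_kv_heads
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

out = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 12, 8, 256])

The KV projections are 3x smaller than the query projection, which is the point of GQA: a smaller KV cache at inference time with minimal quality loss.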
Training
The model was pretrained on a massive dataset of 14 Teratokens, comprising web, code, STEM, high-quality curated, and multilingual data. Training was carried out on 1024 H100 GPU chips.
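
For a sense of scale, a back-of-envelope estimate using the common ~6 * N * D training-FLOPs approximation: the parameter count and token count come from the card, but the per-GPU throughput below is a hypothetical illustration, not a reported figure.

# Rough training-compute estimate via the ~6 * N * D approximation.
n_params = 7e9    # ~7B parameters (from the card)
n_tokens = 14e12  # 14 Teratokens (from the card)

total_flops = 6 * n_params * n_tokens
print(f"~{total_flops:.2e} FLOPs")  # ~5.88e+23 FLOPs

# Hypothetical sustained bf16 throughput per H100 (peak is ~989 TFLOP/s dense;
# real utilization is far lower, so treat this as illustrative only).
sustained_per_gpu = 400e12
gpus = 1024
days = total_flops / (sustained_per_gpu * gpus) / 86400
print(f"~{days:.0f} days on {gpus} GPUs at the assumed throughput")  # ~17 days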
Guide: Running Locally
To run Falcon3-7B-Base locally:
- Setup: Install PyTorch and the transformers library.
- Code: Use the following snippet to load the model and generate text:

import torch
from transformers import pipeline

# Load the model in bfloat16 and place it automatically across available devices.
pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-7B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

response = pipe("Question: How many hours in one day? Answer: ")
print(response[0]["generated_text"])
- Hardware Recommendation: For optimal performance, consider using cloud GPUs such as NVIDIA A100 or H100.
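
The GPU recommendation follows from a rough memory estimate: the weights alone occupy about 14 GB in bfloat16, before the KV cache and activations. A minimal sketch, using the dimensions from the architecture list and standard KV-cache arithmetic (illustrative, not a measured footprint):

# Rough VRAM estimate for bf16 inference; illustrative numbers only.
n_params = 7e9
bytes_per_param = 2  # bfloat16
print(f"weights alone: ~{n_params * bytes_per_param / 1e9:.0f} GB")  # ~14 GB

# KV cache per token, from the architecture list: 28 layers,
# 4 KV heads x 256 head dim, keys + values, 2 bytes each.
kv_per_token = 28 * 4 * 256 * 2 * 2  # bytes
ctx = 32_000
print(f"KV cache at full 32K context: ~{kv_per_token * ctx / 1e9:.1f} GB")  # ~3.7 GB

At around 18 GB total for a single full-context sequence, a 40 GB or 80 GB A100/H100 leaves comfortable headroom, while smaller consumer GPUs would be tight.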
License
Falcon3-7B-Base is licensed under the TII Falcon-LLM License 2.0. For more details, visit Falcon LLM Terms and Conditions.