Falcon3-10B-Base
Introduction
The Falcon3-10B-Base model is part of the Falcon3 family of Open Foundation Models, developed by the Technology Innovation Institute. These models are designed for tasks such as reasoning, language understanding, instruction following, code, and mathematics. The model supports English, French, Spanish, and Portuguese, offering a context length of up to 32K. This base model is pretrained and requires further fine-tuning for specific use cases.
Architecture
- Transformer-based Architecture: Utilizes a causal decoder-only structure with 40 decoder blocks.
- Grouped Query Attention (GQA): Comprises 12 query heads and 4 key-value heads for efficient inference.
- Wider Head Dimension: 256 dimensions.
- High RoPE Base: A rotary position embedding base value of 1000042 supports long-context understanding.
- SwiGLU and RMSNorm: Uses SwiGLU for activation and RMSNorm for normalization.
- Context Length: 32K.
- Vocabulary Size: 131K.
- Pretraining: Enhanced from Falcon3-7B-Base, pretrained on 2 teratokens of diverse datasets using 1024 H100 GPU chips.
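As a sketch of how grouped query attention works, the toy example below shares each key-value head across a group of query heads. The head counts (12 query, 4 key-value) and head dimension (256) come from the list above; the sequence length and all tensor values are illustrative, not the model's actual weights.

```python
import numpy as np

# Toy GQA: 12 query heads share 4 key-value heads (3 queries per KV head).
n_q_heads, n_kv_heads, head_dim, seq_len = 12, 4, 256, 8
group = n_q_heads // n_kv_heads  # 3 query heads per key-value head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq_len, head_dim))
k = rng.standard_normal((n_kv_heads, seq_len, head_dim))
v = rng.standard_normal((n_kv_heads, seq_len, head_dim))

# Broadcast each KV head to its group of query heads
k_rep = np.repeat(k, group, axis=0)  # (12, seq, dim)
v_rep = np.repeat(v, group, axis=0)

scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(head_dim)
# Causal mask: each position attends only to itself and earlier tokens
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[:, mask] = -np.inf
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v_rep  # (12, seq, dim)
print(out.shape)
```

Storing 4 KV heads instead of 12 shrinks the KV cache to a third of its full multi-head size, which is the efficiency gain the card refers to.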
Training
The model was pretrained on a wide array of datasets, including web, code, STEM, and multilingual data, using 1024 H100 GPU chips.
Guide: Running Locally
To run the Falcon3-10B-Base model locally, follow these steps:
- Install Required Libraries: Ensure you have transformers and torch installed:

```shell
pip install transformers torch
```
- Load the Model: Use the Hugging Face Transformers library to load the model with the following code:

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-10B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
response = pipe("Question: How many hours in one day? Answer: ")
print(response[0]["generated_text"])
```
- Suggested Hardware: For optimal performance, consider using cloud-based GPUs, such as NVIDIA's A100 or H100, available on platforms like AWS, Google Cloud, or Azure.
License
The Falcon3-10B-Base model is released under the TII Falcon-LLM License 2.0. For more details, visit the license terms and conditions.