Falcon3-3B-Base
Introduction
Falcon3-3B-Base is part of the Falcon3 family of Open Foundation Models developed by the Technology Innovation Institute (TII). The family includes pretrained and instruction-tuned language models (LLMs) ranging from 1 billion to 10 billion parameters. Falcon3-3B-Base is designed for tasks involving reasoning, language understanding, instruction following, code, and mathematics, and supports four languages: English, French, Spanish, and Portuguese.
Architecture
- Model Type: Transformer-based causal decoder-only architecture.
- Decoder Blocks: 22.
- Attention Mechanism: Grouped Query Attention (GQA) with 12 query heads and 4 key-value heads.
- Head Dimension: 256.
- High RoPE base value: 1,000,042, enabling long-context understanding.
- Normalization and Activation: Uses SwiGLU and RMSNorm.
- Context Length: 8,000 tokens.
- Vocabulary Size: 131,000 tokens.
- The model was pruned from Falcon3-7B-Base and trained on 100 gigatokens of diverse data using 1,024 NVIDIA H100 GPUs (the values above can be checked against the published model configuration, as sketched below).
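The architectural values listed above can be cross-checked directly from the checkpoint's configuration file. The snippet below is a minimal sketch; the attribute names assume the Llama-style config layout used by recent transformers releases, so individual fields may be absent in other versions.

```python
from transformers import AutoConfig

# Load only the configuration published with the checkpoint (no weights are downloaded).
config = AutoConfig.from_pretrained("tiiuae/Falcon3-3B-Base")

# Attribute names assume a Llama-style config; getattr guards fields that
# may not exist in older transformers versions.
print("decoder blocks: ", config.num_hidden_layers)
print("query heads:    ", config.num_attention_heads)
print("key-value heads:", getattr(config, "num_key_value_heads", None))
print("head dimension: ", getattr(config, "head_dim", None))
print("RoPE base:      ", getattr(config, "rope_theta", None))
print("context length: ", config.max_position_embeddings)
print("vocabulary size:", config.vocab_size)
```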
Training
Falcon3-3B-Base was trained efficiently using a knowledge distillation objective. It is a raw, pretrained model and should be further fine-tuned using methods like Supervised Fine Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining for optimal performance in specific use cases.
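As an illustration of the fine-tuning step, the sketch below uses the trl library's SFTTrainer. The dataset name and hyperparameters are placeholder assumptions for demonstration only and are not part of the Falcon3 release.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder instruction dataset; substitute your own SFT data.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Illustrative hyperparameters, not tuned for this model.
config = SFTConfig(
    output_dir="falcon3-3b-sft",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="tiiuae/Falcon3-3B-Base",  # SFTTrainer can load the model from its hub id
    train_dataset=dataset,
    args=config,
)
trainer.train()
```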
Guide: Running Locally
To run Falcon3-3B-Base locally, follow these steps:
- Install the Transformers Library: ensure the transformers library is installed in your Python environment.
- Set Up the Environment: import the necessary libraries and build a text-generation pipeline.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-3B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

response = pipe("Question: How many hours in one day? Answer: ")
print(response[0]["generated_text"])
```
- Hardware Suggestions: use cloud GPUs such as NVIDIA's H100 for better performance; they are well suited to large-model inference. A lower-memory, quantized loading option is sketched below.
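If a data-center GPU is not available, a quantized load can reduce memory requirements considerably. The sketch below assumes the bitsandbytes package is installed and a CUDA GPU is present; 4-bit loading is an illustrative option, not an official recommendation for this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantized load (assumes the bitsandbytes package and a CUDA GPU).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-3B-Base")
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/Falcon3-3B-Base",
    quantization_config=quant_config,
    device_map="auto",
)

inputs = tokenizer("Question: How many hours in one day? Answer: ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```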
License
Falcon3-3B-Base is released under the TII Falcon-LLM License 2.0. For detailed terms and conditions, refer to the license link.