Falcon3-3B-Base


Introduction

Falcon3-3B-Base is part of the Falcon3 family of Open Foundation Models, developed by the Technology Innovation Institute. This family includes pretrained and instruction-tuned large language models (LLMs) ranging from 1 billion to 10 billion parameters. Falcon3-3B-Base is designed for tasks involving reasoning, language understanding, instruction following, code, and mathematics, and supports four languages: English, French, Spanish, and Portuguese.

Architecture

  • Model Type: Transformer-based causal decoder-only architecture.
  • Decoder Blocks: 22.
  • Attention Mechanism: Grouped Query Attention (GQA) with 12 query heads and 4 key-value heads (see the cache-size sketch after this list).
  • Head Dimension: 256.
  • RoPE: high base value (1,000,042) to support long-context understanding.
  • Normalization and Activation: RMSNorm and SwiGLU.
  • Context Length: 8K (8,192) tokens.
  • Vocabulary Size: 131K (131,072) tokens.
  • Pruned from Falcon3-7B-Base, then trained on 100 gigatokens of diverse data using 1,024 NVIDIA H100 GPUs.
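
To make the GQA numbers above concrete, here is a rough back-of-envelope sketch of the key-value cache footprint they imply. It assumes a bfloat16 cache (2 bytes per value) and is illustrative only, not an official figure:

    # Back-of-envelope KV-cache estimate from the figures listed above.
    # Assumes a bfloat16 cache (2 bytes per value); illustrative only.
    n_layers = 22
    n_kv_heads = 4        # GQA caches 4 KV heads, not all 12 query heads
    head_dim = 256
    bytes_per_value = 2   # bfloat16

    # K and V each store n_kv_heads * head_dim values per token per layer.
    kv_bytes_per_token = 2 * n_kv_heads * head_dim * bytes_per_value * n_layers
    print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")  # ~88 KiB

    # At the full 8K context:
    print(f"KV cache at 8,192 tokens: "
          f"{kv_bytes_per_token * 8192 / 2**20:.0f} MiB")  # ~704 MiB

With standard multi-head attention (12 cached heads instead of 4), the same context would need roughly three times as much memory.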

Training

Falcon3-3B-Base was trained efficiently using a knowledge distillation objective. It is a raw, pretrained model; for optimal performance in specific use cases, it should be further adapted with methods such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining.
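
For intuition, the snippet below sketches a standard soft-label distillation loss of the kind the objective above refers to. It assumes PyTorch; the temperature, reduction, and function name are illustrative choices, not TII's actual training recipe:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          temperature: float = 2.0) -> torch.Tensor:
        """Soft-label KL distillation: push the student's next-token
        distribution toward the (frozen) teacher's. Logits have shape
        (batch, seq_len, vocab_size). Illustrative sketch only."""
        t = temperature
        student_logp = F.log_softmax(student_logits / t, dim=-1)
        teacher_p = F.softmax(teacher_logits / t, dim=-1)
        # F.kl_div expects log-probabilities as input and probabilities
        # as target; t**2 rescales gradients to the usual magnitude.
        return F.kl_div(student_logp, teacher_p, reduction="batchmean") * t**2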

Guide: Running Locally

To run Falcon3-3B-Base locally, follow these steps:

  1. Install the Transformers Library: Ensure you have the transformers library installed in your Python environment.

  2. Set Up the Environment: Import the necessary libraries and set up a text-generation pipeline.

    import torch
    from transformers import pipeline
    
    # Load the model in bfloat16 and let the device map place it on the
    # available GPU(s) automatically.
    pipe = pipeline(
        "text-generation",
        model="tiiuae/Falcon3-3B-Base",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
    # As a base (non-instruct) model, it completes text rather than
    # following chat turns; 'generated_text' includes the prompt.
    response = pipe("Question: How many hours in one day? Answer: ")
    print(response[0]["generated_text"])
    
  3. Hardware Suggestions: For better performance, use cloud GPUs such as NVIDIA's H100, which are well suited to large-model inference; a quick VRAM sanity check follows this list.
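
Before loading, it can help to confirm a GPU is visible and roughly sized for the job: a 3B-parameter model in bfloat16 needs about 6 GB of VRAM for weights alone, plus KV cache and activations. A minimal check, assuming PyTorch with CUDA:

    import torch

    # Report the first visible GPU, or warn if generation will run on CPU.
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"{props.name}: {props.total_memory / 2**30:.0f} GiB VRAM")
    else:
        print("No CUDA GPU found; inference will fall back to CPU and be slow.")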

License

Falcon3-3B-Base is released under the TII Falcon-LLM License 2.0. For detailed terms and conditions, refer to the license published by TII.
