SmolLM2-1.7B

HuggingFaceTB

Introduction

SmolLM2 is a compact family of language models available in sizes of 135M, 360M, and 1.7B parameters. Designed for efficiency, they can perform a wide range of tasks and run on-device. The 1.7B variant shows improvements in instruction following, knowledge, reasoning, and mathematics over its predecessor, SmolLM1-1.7B. It was trained on 11 trillion tokens using a diverse dataset, with enhancements for instruction tasks through supervised fine-tuning and Direct Preference Optimization.

Architecture

SmolLM2 employs a Transformer decoder architecture and was pretrained using 11 trillion tokens. The model training was conducted with 256 H100 GPUs and leveraged the nanotron training framework. The model supports bfloat16 precision for efficient computation.
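As a back-of-the-envelope check on why bfloat16 matters for on-device use, the memory taken by the weights alone scales with bytes per parameter; a small sketch (ignoring activations, KV cache, and framework overhead):

```python
# Rough weight-memory estimate for SmolLM2-1.7B at different precisions.
PARAMS = 1.7e9
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

def weight_memory_gb(dtype: str) -> float:
    """Memory (GB) occupied by the model weights alone at the given dtype."""
    return PARAMS * BYTES_PER_PARAM[dtype] / 1e9

print(f"float32:  {weight_memory_gb('float32'):.1f} GB")   # 6.8 GB
print(f"bfloat16: {weight_memory_gb('bfloat16'):.1f} GB")  # 3.4 GB
```

Halving bytes per parameter roughly halves the memory needed just to hold the weights, which is what makes the bfloat16 path practical on smaller GPUs.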

Training

SmolLM2 was developed using a combination of public and curated datasets, including FineWeb-Edu, DCLM, The Stack, and new datasets for mathematics and coding. The instruct version of the model was fine-tuned using curated datasets and Direct Preference Optimization using UltraFeedback.
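Because the instruct variant was fine-tuned on conversational data, prompts to it should go through the tokenizer's chat template rather than raw text. A minimal sketch, assuming the instruct checkpoint is named `HuggingFaceTB/SmolLM2-1.7B-Instruct`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name for the instruction-tuned variant.
checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "What is gravity?"}]
# apply_chat_template wraps the conversation in the special tokens
# the model saw during supervised fine-tuning.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

Skipping the chat template with an instruct model usually degrades output quality, since the model expects the fine-tuning format.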

Guide: Running Locally

To run the SmolLM2-1.7B model locally:

  1. Install the Transformers Library:

    pip install transformers
    
  2. Basic Script for Running the Model:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    checkpoint = "HuggingFaceTB/SmolLM2-1.7B"
    device = "cuda"  # Use "cuda" for GPU or "cpu" for CPU
    
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)
    
    inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(device)
    outputs = model.generate(inputs, max_new_tokens=50)  # cap the generation length
    
    print(tokenizer.decode(outputs[0]))
    
  3. For Using Multiple GPUs and Different Precision:

    • Install the accelerate library.
    • Use torch_dtype=torch.bfloat16 or torch_dtype=torch.float16 for reduced precision.
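The two options in step 3 can be combined in a single loading call; a minimal sketch, assuming accelerate is installed so that device_map="auto" can place the model across whatever GPUs are available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# device_map="auto" (requires the accelerate library) shards the layers
# across available devices; torch_dtype=torch.bfloat16 halves the weight
# memory compared with the default float32.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```

On hardware without bfloat16 support, torch.float16 is the usual fallback.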
  4. Suggested Cloud GPUs:

    • Consider cloud providers such as AWS, GCP, or Azure for access to powerful GPUs like the NVIDIA V100 or A100.

License

The SmolLM2 model is released under the Apache 2.0 License, which permits free use, modification, and distribution.