SmolLM2-CoT-360M-GGUF

prithivMLmods

Introduction

SmolLM2 is a family of compact language models designed for efficiency and versatility, available in three parameter sizes: 135M, 360M, and 1.7B. The models can handle a wide range of tasks while remaining lightweight enough for on-device execution.

Architecture

The SmolLM2 models use a compact transformer architecture optimized for text generation and reasoning tasks. The variant discussed here, SmolLM2-CoT-360M, is fine-tuned for chain-of-thought reasoning.
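To illustrate what chain-of-thought prompting looks like in practice, here is a minimal sketch of building such a prompt. The instruction wording and prompt layout are assumptions for illustration; the model card does not specify a required template.

```python
# Sketch: constructing a chain-of-thought style prompt.
# The instruction text and layout are illustrative, not a mandated template.
def build_cot_prompt(question: str) -> str:
    instruction = "Think step by step, then give the final answer."
    return f"{instruction}\n\nQuestion: {question}\nReasoning:"

prompt = build_cot_prompt(
    "If a train travels 60 km in 40 minutes, what is its speed in km/h?"
)
print(prompt)
```

The trailing "Reasoning:" cue encourages the model to emit intermediate steps before the final answer, which is the behavior CoT fine-tuning targets.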

Training

Training SmolLM2 models involves a structured process:

  1. Environment Setup: Install necessary Python libraries such as transformers, datasets, trl, torch, accelerate, bitsandbytes, and wandb.
  2. Loading Pre-trained Models and Tokenizers: Use Hugging Face's AutoModelForCausalLM and AutoTokenizer.
  3. Dataset Preparation: Load and tokenize the Deepthink-Reasoning dataset using Hugging Face datasets.
  4. Training Configuration: Define parameters like batch size and learning rate using TrainingArguments.
  5. Model Training: Utilize SFTTrainer to fine-tune the model.
  6. Model Saving: Save the fine-tuned model locally for future use.
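The six steps above can be sketched end-to-end as follows. This is a minimal illustration rather than the actual training script: the base checkpoint, hyperparameter values, and dataset repository id are assumptions, and the trl API differs across versions (newer releases use SFTConfig instead of TrainingArguments), so adapt to the version you install.

```python
# Minimal supervised fine-tuning sketch (all hyperparameters are placeholders).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

# Steps 1-2: load a pre-trained model and tokenizer (base checkpoint assumed).
model_name = "HuggingFaceTB/SmolLM2-360M"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Step 3: load the reasoning dataset (repository id is an assumption).
dataset = load_dataset("prithivMLmods/Deepthink-Reasoning", split="train")

# Step 4: training configuration (batch size / learning rate are examples).
args = TrainingArguments(
    output_dir="smollm2-cot-360m",
    per_device_train_batch_size=4,
    learning_rate=2e-5,
    num_train_epochs=1,
    report_to="wandb",
)

# Step 5: fine-tune with SFTTrainer.
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()

# Step 6: save the fine-tuned model locally.
trainer.save_model("smollm2-cot-360m")
```

Running this requires a GPU and network access to download the checkpoint and dataset, so it is shown here as a structural sketch of the pipeline rather than a turnkey script.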

Guide: Running Locally

  1. Install Required Libraries: Use pip to install the following packages:
    pip install transformers datasets trl torch accelerate bitsandbytes wandb
    
  2. Load Model and Tokenizer:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model = AutoModelForCausalLM.from_pretrained("prithivMLmods/SmolLM2-CoT-360M")
    tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/SmolLM2-CoT-360M")
    
  3. Run Inference:
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    input_text = "What is the capital of France?"
    inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
    outputs = model.generate(inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  4. Consider Cloud GPUs: For faster inference and fine-tuning, use cloud platforms such as AWS, Google Cloud, or Azure, which offer GPU instances.
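Since the model is also distributed in GGUF format, it can be run with llama.cpp-based tooling instead of transformers. A sketch using llama-cpp-python follows; the quantization filename pattern is an assumption, so check the repository's file listing for the actual names.

```python
from llama_cpp import Llama

# Download and load a GGUF quantization directly from the Hub.
# The filename glob is an assumption (check the repo for available quants).
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/SmolLM2-CoT-360M-GGUF",
    filename="*Q4_K_M.gguf",
)

out = llm("What is the capital of France?", max_tokens=50)
print(out["choices"][0]["text"])
```

This path runs on CPU without PyTorch installed, which suits the on-device use case the model family targets.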

License

The SmolLM2-CoT-360M-GGUF model is licensed under the Apache-2.0 License, allowing for use, modification, and distribution with proper attribution and without warranty.
