SmolLM2-CoT-360M
Introduction
SmolLM2 is a family of compact language models designed to solve various tasks efficiently. Available in sizes of 135M, 360M, and 1.7B parameters, these models are lightweight enough to run on-device. The SmolLM2-CoT-360M model is particularly aimed at text generation and reasoning tasks.
Architecture
SmolLM2-CoT-360M is a causal (decoder-only) language model optimized for text generation and reasoning tasks. It is distributed in the safetensors format and loads through the Hugging Face transformers library for efficient inference.
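To check the architecture details yourself, you can inspect the model's configuration without downloading the weights. This is a minimal sketch, assuming the hub ID `prithivMLmods/SmolLM2-CoT-360M` used later in this card:

```python
from transformers import AutoConfig

# Fetch only the config file (no weights) and print key architecture fields.
config = AutoConfig.from_pretrained("prithivMLmods/SmolLM2-CoT-360M")
print(config.model_type)         # architecture family
print(config.num_hidden_layers)  # transformer depth
print(config.hidden_size)        # embedding width
```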
Training
Fine-tuning SmolLM2 involves several key steps (a minimal end-to-end sketch follows this list):
- Setting Up the Environment: Install necessary libraries such as `transformers`, `datasets`, `trl`, `torch`, and `wandb`.
- Loading Pre-trained Models: Use `AutoModelForCausalLM` and `AutoTokenizer` to load the model and tokenizer.
- Preparing the Dataset: Load and tokenize the dataset, such as Deepthink-Reasoning.
- Configuring Training Arguments: Set up parameters like batch size, learning rate, and device settings.
- Training: Use `SFTTrainer` to fine-tune the model.
- Saving the Model: Save the fine-tuned model and tokenizer for future use.
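The sketch below strings these steps together with trl's `SFTTrainer`. The hub dataset ID `prithivMLmods/Deepthink-Reasoning`, the hyperparameter values, and the assumption that the dataset exposes a `text` (or chat-style `messages`) column are illustrative rather than confirmed by this card, and argument names vary slightly across `trl` versions:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

# Load the pre-trained model and tokenizer.
model_name = "prithivMLmods/SmolLM2-CoT-360M"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Assumed hub ID for the Deepthink-Reasoning dataset mentioned above.
dataset = load_dataset("prithivMLmods/Deepthink-Reasoning", split="train")

# Illustrative hyperparameters; tune batch size and learning rate for your GPU.
args = SFTConfig(
    output_dir="smollm2-cot-360m-sft",
    per_device_train_batch_size=4,
    learning_rate=2e-5,
    num_train_epochs=1,
    logging_steps=10,
    report_to="wandb",  # remove this line to train without wandb logging
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older trl releases name this argument `tokenizer`
)
trainer.train()

# Save the fine-tuned model and tokenizer for future use.
trainer.save_model("smollm2-cot-360m-sft")
tokenizer.save_pretrained("smollm2-cot-360m-sft")
```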
Guide: Running Locally
To run the SmolLM2-CoT-360M model locally, follow these steps:
- Install Libraries (`accelerate` and `bitsandbytes` are only needed for the optional quantized loading shown after these steps):

```bash
pip install transformers datasets trl torch accelerate bitsandbytes wandb
```
- Load the Model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/SmolLM2-CoT-360M"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
- Set Up Environment: Detect and use available hardware such as a GPU.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```
- Run Inference:

```python
# Format the prompt with the model's chat template, then generate.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # append the assistant turn marker
)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
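If memory is tight, the `accelerate` and `bitsandbytes` packages installed above can load the model with 4-bit quantization in place of the full-precision load in the second step. This is a sketch of a standard transformers pattern rather than something this card prescribes, and it requires a CUDA GPU; with `device_map="auto"`, skip the manual `model.to(device)` call.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load weights in 4-bit to cut GPU memory use (CUDA only).
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/SmolLM2-CoT-360M",
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on available devices
)
```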
Cloud GPUs
For more efficient training and inference, consider using cloud-based GPU resources such as AWS, Google Cloud, or Azure.
License
The SmolLM2-CoT-360M model is licensed under the Apache 2.0 License, allowing for broad use and modification with attribution.