MicroThinker-1B-Preview

Introduction
MicroThinker-1B-Preview is fine-tuned from huihui-ai/Llama-3.2-1B-Instruct-abliterated and is aimed at enhancing AI reasoning capabilities. This version focuses on text generation and uses the Transformers library.
Architecture
The model architecture is based on the Llama-3.2 series, specifically the Llama-3.2-1B-Instruct-abliterated variant. It is designed to be uncensored, conversational, and suitable for text-generation tasks, leveraging the advanced features of the Transformers library.
Training
MicroThinker-1B-Preview was trained in a test environment on a single RTX 4090 GPU with 24 GB of memory. Fine-tuning used the SFT (Supervised Fine-Tuning) framework with 20,000 records drawn from two datasets, QWQ-LONGCOT-500K and LONGCOT-Refine-500K. Training ran for one epoch, with attention to hyperparameters such as the learning rate, batch size, and gradient accumulation steps.
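The card does not show the SFT data format. As a rough sketch, long-chain-of-thought records like those in QWQ-LONGCOT-500K are typically converted into chat-style message lists before supervised fine-tuning; the field names `prompt` and `response` below are hypothetical, not the dataset's actual schema:

```python
def to_chat_record(row):
    """Convert one raw dataset row into a chat-style SFT record.

    The keys "prompt" and "response" are hypothetical; the real
    QWQ-LONGCOT-500K schema may differ.
    """
    return {
        "messages": [
            {"role": "user", "content": row["prompt"]},
            {"role": "assistant", "content": row["response"]},
        ]
    }

# Example usage with a made-up row
record = to_chat_record({
    "prompt": "What is 2 + 2?",
    "response": "Let's think step by step. 2 + 2 = 4.",
})
print(record["messages"][0]["role"])  # user
```

In this layout, the assistant turn carries the full long chain-of-thought answer, which is what the supervised loss is computed on.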
Guide: Running Locally
- Setup Environment:

  Create a conda environment and activate it:

      conda create -yn ms-swift python=3.11
      conda activate ms-swift

  Clone and install the ms-swift repository in editable mode:

      git clone https://github.com/modelscope/ms-swift.git
      cd ms-swift
      pip install -e .
      cd ..
- Download Model and Datasets:

  Use the huggingface-cli to download the base model and both datasets:

      huggingface-cli download huihui-ai/Llama-3.2-1B-Instruct-abliterated --local-dir ./huihui-ai/Llama-3.2-1B-Instruct-abliterated
      huggingface-cli download --repo-type dataset huihui-ai/QWQ-LONGCOT-500K --local-dir ./data/QWQ-LONGCOT-500K
      huggingface-cli download --repo-type dataset huihui-ai/LONGCOT-Refine-500K --local-dir ./data/LONGCOT-Refine-500K
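The same downloads can be driven from Python via huggingface_hub; a minimal sketch mirroring the CLI commands above, assuming huggingface_hub is installed (the import and download loop are guarded so nothing is fetched on import):

```python
# (repo_id, repo_type, local_dir) triples mirroring the CLI commands above
DOWNLOADS = [
    ("huihui-ai/Llama-3.2-1B-Instruct-abliterated", "model",
     "./huihui-ai/Llama-3.2-1B-Instruct-abliterated"),
    ("huihui-ai/QWQ-LONGCOT-500K", "dataset", "./data/QWQ-LONGCOT-500K"),
    ("huihui-ai/LONGCOT-Refine-500K", "dataset", "./data/LONGCOT-Refine-500K"),
]

if __name__ == "__main__":
    # Deferred import: only needed when actually downloading.
    from huggingface_hub import snapshot_download

    for repo_id, repo_type, local_dir in DOWNLOADS:
        snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir)
```

`snapshot_download` resumes partial downloads, which is convenient for the two 500K-record datasets.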
- Fine-Tuning:

  Follow the provided commands for fine-tuning using the swift tool. For faster training, consider cloud GPUs such as AWS EC2 P3 instances or Google Cloud A100 GPUs.
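The card names the key training knobs (batch size, gradient accumulation) but not their values. As a sketch of how they interact on a single 24 GB GPU, with hypothetical numbers:

```python
# Hypothetical values: the card names these knobs but not their settings.
per_device_batch_size = 1         # kept small to fit a 24 GB RTX 4090
gradient_accumulation_steps = 16  # accumulate gradients before each optimizer step
num_records = 20_000              # records used from the two datasets
num_epochs = 1

# Gradient accumulation multiplies the effective batch size without
# increasing peak GPU memory.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps
optimizer_steps = num_records * num_epochs // effective_batch_size

print(effective_batch_size)  # 16
print(optimizer_steps)       # 1250
```

This is why gradient accumulation appears alongside batch size in the training notes: it is the standard way to reach a reasonable effective batch size on a single consumer GPU.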
- Inference:

  Perform streaming inference with:

      swift infer --model huihui/MicroThinker-1B-Preview --stream true --infer_backend pt --max_new_tokens 8192
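As an alternative to `swift infer`, the model can likely be run directly with the Transformers library. A hedged sketch: the `build_messages` helper is our own, the model id is taken verbatim from the command above, and the heavy model load is guarded so it only runs as a script:

```python
def build_messages(prompt):
    """Wrap a user prompt in the chat-message format expected by
    Llama-3.2-style instruct models."""
    return [{"role": "user", "content": prompt}]

if __name__ == "__main__":
    # Deferred import: requires torch and a model download.
    from transformers import pipeline

    # Model id as given in the card's swift infer command.
    pipe = pipeline("text-generation", model="huihui/MicroThinker-1B-Preview")
    out = pipe(build_messages("How many r's are in 'strawberry'?"),
               max_new_tokens=8192)
    print(out[0]["generated_text"])
```

Recent versions of the text-generation pipeline accept chat-message lists directly and apply the model's chat template automatically.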
License
MicroThinker-1B-Preview is released under the Apache-2.0 License, allowing for open use and modification under its terms.