MobileLLM-125M
Introduction
MobileLLM is a language model family developed by Meta for resource-constrained, on-device applications. It optimizes sub-billion-parameter models for efficient execution on mobile devices, integrating techniques such as the SwiGLU activation function and grouped-query attention. MobileLLM improves accuracy over previous state-of-the-art models of comparable size on zero-shot commonsense reasoning tasks.
Architecture
MobileLLM employs an auto-regressive transformer architecture, tailored for on-device use. Key features include:
- SwiGLU Activation Function: improves accuracy at small model scales (see the sketch after this list).
- Deep and Thin Architecture: favors depth over width, making better use of a fixed parameter budget.
- Embedding Sharing and Grouped-Query Attention: reduce parameter count and memory footprint while preserving accuracy.

The architecture varies across model sizes, from 125M to 1.5B parameters, with improvements in accuracy and efficiency at each scale.
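For intuition, SwiGLU gates one linear projection of the input with the SiLU (swish) of another. Below is a minimal PyTorch sketch of a SwiGLU feed-forward block; the class and dimension names are illustrative, not MobileLLM's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Illustrative feed-forward block with SwiGLU: W_down(SiLU(x W_gate) * (x W_up))."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Elementwise product of a SiLU-gated projection and a plain linear projection
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```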
Training
The models were trained on publicly available online data with a context length of 2,000 tokens and shared input/output embeddings. Training covered 1 trillion tokens on 32 NVIDIA A100 80GB GPUs, with durations ranging from about 3 days for the 125M model to about 18 days for the 1.5B model.
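Embedding sharing ties the input embedding matrix to the output projection, so a single weight matrix serves both roles. A minimal PyTorch sketch of the idea; the sizes below are placeholders, not MobileLLM's actual configuration:

```python
import torch.nn as nn

vocab_size, dim = 32000, 512  # placeholder sizes, not MobileLLM's real dimensions

embed = nn.Embedding(vocab_size, dim)             # token IDs -> hidden vectors
lm_head = nn.Linear(dim, vocab_size, bias=False)  # hidden vectors -> vocabulary logits
lm_head.weight = embed.weight                     # tie: both layers now share one matrix
```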
Guide: Running Locally
Steps to Run
- Using Hugging Face:
  - Install the transformers library (for example, pip install transformers).
  - Load the tokenizer and model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/MobileLLM-125M", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("facebook/MobileLLM-125M", trust_remote_code=True)
```
  - Add special tokens if needed:
```python
tokenizer.add_special_tokens({
    "eos_token": "</s>",
    "bos_token": "<s>",
    "unk_token": "<unk>",
})
```
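  - Optionally, run a quick generation check. This is a minimal sketch; the prompt and generation settings are arbitrary examples, not values from the model card:

```python
import torch

prompt = "The capital of France is"  # arbitrary example prompt
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```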
- Using the MobileLLM Codebase:
  - Clone the repository:
    git clone https://github.com/facebookresearch/MobileLLM
  - Install dependencies:
    pip install -r requirement.txt
  - Pre-process the data and run pretraining:
    bash pretrain.sh
  - Evaluate with:
    bash eval.sh
Cloud GPU Recommendation
For optimal performance, it is recommended to use cloud services offering NVIDIA A100 GPUs, such as AWS, Google Cloud, or Azure, especially for training or fine-tuning large models.
License
MobileLLM is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC 4.0).