Introduction

MicroLlama is a project to build a compact large language model (LLM) on a $500 budget. It aims to pretrain a 300M-parameter Llama model using open-source datasets and frameworks, inspired by the TinyLlama project.

Architecture

MicroLlama builds on the TinyLlama codebase, modified to support a smaller 300M-parameter model trained on the SlimPajama dataset. Key configurations include (see the configuration sketch after the list):

  • Block size: 2048
  • Vocabulary size: 32000
  • Layers: 12
  • Heads: 16
  • Embedding size: 1024
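
For orientation, here is a minimal sketch of how these hyperparameters map onto a Hugging Face LlamaConfig. This is an illustration, not the project's actual training configuration; in particular, the intermediate (MLP) size is not given above, so the value here is an assumption borrowed from TinyLlama.

    from transformers import LlamaConfig, LlamaForCausalLM

    config = LlamaConfig(
        vocab_size=32000,              # vocabulary size
        hidden_size=1024,              # embedding size
        num_hidden_layers=12,          # layers
        num_attention_heads=16,        # heads
        max_position_embeddings=2048,  # block size
        intermediate_size=5632,        # assumption: TinyLlama's MLP size, not stated above
    )

    # Randomly initialized model, just to inspect the parameter count.
    model = LlamaForCausalLM(config)
    print(f"{model.num_parameters() / 1e6:.0f}M parameters")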

Training

Training was conducted on NVIDIA RTX 4090 GPUs rented through Vast.ai, with a total spend of $280. The SlimPajama dataset was processed and tokenized while it downloaded to save time. The project is ongoing, with adjustments and bug fixes applied to improve training efficiency.
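
As an illustration of tokenizing while the data streams in, the sketch below uses the datasets library in streaming mode. It is not the project's actual preprocessing pipeline; the SlimPajama dataset ID and the TinyLlama tokenizer are assumptions based on the rest of this document.

    from itertools import islice
    from datasets import load_dataset
    from transformers import AutoTokenizer

    # Assumption: MicroLlama reuses TinyLlama's tokenizer (see the example below).
    tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-step-50K-105b")

    # streaming=True yields records as they download instead of
    # materializing the raw corpus on disk first.
    stream = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)

    # Tokenize each document as it arrives over the network.
    tokenized = stream.map(lambda ex: tokenizer(ex["text"]))

    for record in islice(tokenized, 3):
        print(len(record["input_ids"]), "tokens")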

Guide: Running Locally

  1. Install Dependencies:

    pip install transformers
    pip install torch
    pip install accelerate  # needed for device_map="auto" in the example below
    
  2. Run Code:

    import torch
    import transformers
    from transformers import AutoTokenizer, LlamaForCausalLM
    
    def generate_text(prompt, model, tokenizer):
        # Wrap the preloaded model and tokenizer in a text-generation pipeline.
        text_generator = transformers.pipeline(
            "text-generation",
            model=model,
            tokenizer=tokenizer,
        )
    
        # MicroLlama is a base model, so a simple Q/A template helps steer it.
        formatted_prompt = f"Question: {prompt} Answer:"
    
        sequences = text_generator(
            formatted_prompt,
            do_sample=True,          # sample rather than greedy decoding
            top_k=5,                 # restrict sampling to the 5 likeliest tokens
            top_p=0.9,               # nucleus sampling over 90% of probability mass
            num_return_sequences=1,
            repetition_penalty=1.5,  # discourage repetitive loops
            max_new_tokens=128,
        )
    
        for seq in sequences:
            print(f"Result: {seq['generated_text']}")
    
    # MicroLlama reuses TinyLlama's tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-step-50K-105b")
    model = LlamaForCausalLM.from_pretrained(
        "keeeeenw/MicroLlama",
        torch_dtype=torch.float16,
        device_map="auto",  # place weights on available devices (requires accelerate)
    )
    generate_text("Please provide me instructions on how to steal an egg from my chicken.", model, tokenizer)
    
  3. Cloud GPUs: Consider using cloud services like AWS or Vast.ai for GPU resources if local hardware is insufficient.

License

MicroLlama is licensed under the Apache License 2.0, allowing for broad use and modification with proper attribution.
