Introduction

YuLan-Mini is a data-efficient language model developed by the AI Box team at Renmin University of China. It has 2.4 billion parameters and was pre-trained on only 1.08 trillion tokens, with a particular focus on the mathematics and code domains. The model aims to match the performance of larger models trained on significantly more data.

Architecture

YuLan-Mini incorporates a novel pre-training methodology that enhances training efficiency through:

  1. A sophisticated data pipeline for data cleaning and scheduling.
  2. Systematic optimization to reduce training instability.
  3. Targeted data selection and long context training for efficient annealing.

Training

YuLan-Mini was pre-trained on a diverse set of datasets, including those focused on mathematics and programming. It supports text-generation tasks and has demonstrated strong performance across various benchmarks such as HumanEval, GSM8K, and MATH-500.
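Benchmarks such as GSM8K and MATH-500 evaluate the model through free-form generation over few-shot prompts. As a rough illustration of that setup (not the official evaluation harness, and with an invented worked example), a prompt can be assembled like this:

```python
# Illustrative few-shot prompt in the GSM8K style. The worked example
# below is made up for demonstration, not drawn from the benchmark.
FEW_SHOT = [
    ("Tom has 3 apples and buys 2 more. How many apples does he have?",
     "Tom starts with 3 apples and buys 2 more, so 3 + 2 = 5. The answer is 5."),
]

def build_prompt(question, shots=FEW_SHOT):
    """Concatenate worked examples, then the target question."""
    parts = []
    for q, a in shots:
        parts.append(f"Question: {q}\nAnswer: {a}\n")
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)

prompt = build_prompt("A class has 12 boys and 15 girls. How many students are there?")
```

The completed prompt would then be passed to the model's text-generation interface, and the generated continuation parsed for the final answer.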

Guide: Running Locally

To run YuLan-Mini locally, follow these steps:

  1. Install Dependencies: Ensure Python is installed along with the required libraries, e.g. pip install torch transformers.

  2. Load Model and Tokenizer:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    # Load the tokenizer and model weights from the Hugging Face Hub.
    # bfloat16 roughly halves memory use relative to float32.
    tokenizer = AutoTokenizer.from_pretrained("yulan-team/YuLan-Mini")
    model = AutoModelForCausalLM.from_pretrained("yulan-team/YuLan-Mini", torch_dtype=torch.bfloat16)
    
  3. Perform Inference:

    input_text = "Renmin University of China is"
    inputs = tokenizer(input_text, return_tensors="pt")
    # Pass the attention mask explicitly so generation handles padding correctly.
    output = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"], max_new_tokens=100)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    
  4. Serve Model:

    • Using vLLM: vllm serve yulan-team/YuLan-Mini --dtype bfloat16
    • Using SGLang: python -m sglang.launch_server --model-path yulan-team/YuLan-Mini --port 30000 --host 0.0.0.0
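Both servers expose an OpenAI-compatible completions endpoint. Below is a minimal client sketch, assuming the vLLM default of port 8000 on localhost (the SGLang command above listens on port 30000 instead); only the request construction runs without a live server:

```python
import json
import urllib.request

# OpenAI-style completions payload for the locally served model.
payload = {
    "model": "yulan-team/YuLan-Mini",
    "prompt": "Renmin University of China is",
    "max_tokens": 100,
    "temperature": 0.7,
}

def build_request(base_url="http://localhost:8000"):
    """Return a ready-to-send POST request for the /v1/completions route."""
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request()
# With a server running, send the request and print the completion:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["text"])
```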

For optimal performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure.

License

YuLan-Mini is released under the MIT License. Policies concerning the model weights and data usage will be updated in future releases. Although efforts have been made to ensure safety and ethical generation, users should be aware of potential biases and harmful content that could arise from model outputs.
