Arcee-VyLinh

arcee-ai

Introduction

Arcee-VyLinh is a 3-billion-parameter instruction-following model optimized for Vietnamese language understanding and generation. It is trained with evolved hard questions and Direct Preference Optimization (DPO), delivering strong performance despite its compact size.

Architecture

  • Base Model: Qwen2.5-3B
  • Parameters: 3 billion
  • Context Length: 32K tokens
  • Supported Languages: English and Vietnamese, with optimization for Vietnamese
  • Library: Transformers

Training

Arcee-VyLinh's training process involves several stages:

  1. Base Model Selection: Utilized Qwen2.5-3B.
  2. Hard Question Evolution: Generated 20K challenging questions using EvolKit.
  3. Initial Training: Developed VyLinh-SFT with supervised fine-tuning.
  4. Model Merging: Proprietary technique with Qwen2.5-3B-Instruct.
  5. DPO Training: Conducted 6 epochs of iterative DPO using ORPO-Mix-40K.
  6. Final Merge: Combined with Qwen2.5-3B-Instruct for enhanced performance.
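The preference-optimization stage (step 5) uses the standard DPO objective: push the policy to assign a higher likelihood margin to the preferred response than the frozen reference model does. The sketch below computes the DPO loss for a single preference pair; it is a generic illustration of the technique, not Arcee's proprietary pipeline, and the log-probability values and `beta` setting are assumptions for demonstration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), written stably as log1p(exp(-margin))
    return math.log1p(math.exp(-margin))

# When the policy favors the chosen answer more than the reference does,
# the margin is positive and the loss drops below log(2) ~= 0.693.
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0, beta=0.1)
```

In iterative DPO, each epoch's preference pairs are scored against an updated policy, and the `beta` coefficient controls how far the policy may drift from the reference.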

Guide: Running Locally

To run Arcee-VyLinh locally, follow these steps:

  1. Install Transformers Library

    pip install transformers torch
    
  2. Load the Model and Tokenizer

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    # Pick a GPU when available; a 3B model needs roughly 6 GB in bfloat16.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    
    model = AutoModelForCausalLM.from_pretrained(
        "arcee-ai/Arcee-VyLinh",
        torch_dtype=torch.bfloat16,
    ).to(device)
    tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Arcee-VyLinh")
    
  3. Prepare Input and Generate Text

    prompt = "Một cộng một bằng mấy?"  # "What does one plus one equal?"
    messages = [
        # System prompt: "You are a helpful assistant."
        {"role": "system", "content": "Bạn là trợ lí hữu ích."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=1024,
        eos_token_id=tokenizer.eos_token_id,
        do_sample=True,       # sampling must be enabled for temperature to apply
        temperature=0.25,
    )
    
    # Strip the prompt tokens so only the model's reply is decoded.
    new_tokens = generated_ids[:, model_inputs.input_ids.shape[1]:]
    response = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0]
    print(response)
    
  4. Cloud GPUs: For optimal performance, consider using cloud-based GPU resources such as AWS or Google Cloud.
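In step 3, `apply_chat_template` handles prompt formatting automatically. Since Arcee-VyLinh is built on Qwen2.5, its chat format is assumed here to be ChatML; the sketch below rebuilds the same prompt string by hand purely to show what the template produces. The template shipped with the tokenizer remains authoritative.

```python
def chatml_prompt(messages, add_generation_prompt=True):
    """Assemble a ChatML prompt (the format used by Qwen2.5-family models)
    from a list of {"role": ..., "content": ...} dicts."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

text = chatml_prompt([
    {"role": "system", "content": "Bạn là trợ lí hữu ích."},
    {"role": "user", "content": "Một cộng một bằng mấy?"},
])
```

Setting `add_generation_prompt=True` mirrors the same flag in `apply_chat_template`: it appends an open assistant turn so generation starts with the model's reply rather than a new user message.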

License

The license for Arcee-VyLinh is not stated in this documentation. Check the model's repository on Hugging Face for licensing details.
