Arcee-VyLinh
Introduction
Arcee-VyLinh is a 3-billion-parameter instruction-following model optimized for Vietnamese language understanding and generation. It is trained with evolved hard questions and Direct Preference Optimization (DPO), delivering strong performance despite its compact size.
Architecture
- Base Model: Qwen2.5-3B
- Parameters: 3 billion
- Context Length: 32K tokens
- Supported Languages: English and Vietnamese, with optimization for Vietnamese
- Library: Transformers
Training
Arcee-VyLinh's training process involves several stages:
- Base Model Selection: Utilized Qwen2.5-3B.
- Hard Question Evolution: Generated 20K challenging questions using EvolKit.
- Initial Training: Developed VyLinh-SFT with supervised fine-tuning.
- Model Merging: Proprietary technique with Qwen2.5-3B-Instruct.
- DPO Training: Conducted 6 epochs of iterative DPO using ORPO-Mix-40K.
- Final Merge: Combined with Qwen2.5-3B-Instruct for enhanced performance.
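The DPO stage above optimizes the model to prefer chosen responses over rejected ones while staying close to a reference model. A minimal sketch of the per-pair DPO loss (the function name and toy log-probabilities are illustrative, not taken from Arcee's actual pipeline):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# If the policy prefers the chosen response more strongly than the
# reference does, the loss falls below log(2) ≈ 0.693.
loss = dpo_loss(-1.0, -3.0, -2.0, -3.0)
```

When the policy's preference margin equals the reference's, the loss sits exactly at log(2); training pushes it lower by widening the policy's margin on the chosen response.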
Guide: Running Locally
To run Arcee-VyLinh locally, follow these steps:
- Install the Transformers library:

```bash
pip install transformers
```
- Load the model and tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("arcee-ai/Arcee-VyLinh")
tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Arcee-VyLinh")
```
- Prepare input and generate text:

```python
import torch

# Move the model to a GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompt = "Một cộng một bằng mấy?"  # "What does one plus one equal?"
messages = [
    {"role": "system", "content": "Bạn là trợ lí hữu ích."},  # "You are a helpful assistant."
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=1024,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,  # required for temperature to take effect
    temperature=0.25,
)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
- Cloud GPUs: For optimal performance, consider using cloud-based GPU resources such as AWS or Google Cloud.
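As a rough guide when choosing hardware, the memory needed for the weights alone can be estimated from the parameter count and numeric precision (a back-of-envelope sketch that ignores activations and the KV cache):

```python
# Approximate weight memory for a 3B-parameter model at common precisions.
params = 3_000_000_000
bytes_per_param = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype}: ~{gib:.1f} GiB")
```

At fp16 the weights come to roughly 5.6 GiB, so the model fits on a single GPU with 8 GB or more of VRAM, with headroom left for the KV cache.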
License
The license for Arcee-VyLinh is not stated in this documentation. Users should check the model's repository on Hugging Face for licensing details.