EEVE-Korean-Instruct-10.8B-v1.0

yanolja

Introduction

The EEVE-Korean-Instruct-10.8B-v1.0 model by Yanolja is a fine-tuned variant of EEVE-Korean-10.8B-v1.0, designed for Korean language processing. It extends the vocabulary of the upstage/SOLAR-10.7B-v1.0 model and was fine-tuned with Direct Preference Optimization (DPO) via the Axolotl framework.

Architecture

This model is built upon a large language model architecture aimed at improving multilingual capabilities. It features a vocabulary expansion tailored for Korean, building on the strengths of the SOLAR-10.7B base model.
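The core idea of the vocabulary expansion can be illustrated with a minimal sketch: instead of initializing the embeddings of newly added Korean tokens at random, they can be seeded from the embeddings of the subword pieces the token was previously split into. The function name and the toy 4-dimensional vectors below are illustrative only, not the model's actual parameters or method details.

```python
def init_new_token_embedding(subword_embeddings):
    """Initialize a new token's embedding as the mean of its subword embeddings."""
    dim = len(subword_embeddings[0])
    count = len(subword_embeddings)
    return [sum(vec[i] for vec in subword_embeddings) / count for i in range(dim)]

# Toy example: a new Korean token that the old tokenizer split into two pieces.
subwords = [
    [0.25, -0.5, 0.75, 0.0],  # embedding of subword piece 1 (illustrative)
    [0.75,  0.5, 0.25, 1.0],  # embedding of subword piece 2 (illustrative)
]
new_embedding = init_new_token_embedding(subwords)
print(new_embedding)  # [0.5, 0.0, 0.5, 0.5]
```

Averaging keeps the new token's embedding close to the region of the embedding space the model already associates with its constituent pieces, which tends to make continued training more stable than random initialization.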

Training

The training data includes Korean-translated versions of Open-Orca/SlimOrca-Dedup and argilla/ultrafeedback-binarized-preferences-cleaned datasets. No other datasets were used. The model's training methodology is documented in the technical report titled "Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models."
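For context, DPO trains on preference pairs rather than plain completions: each record pairs a prompt with a preferred ("chosen") and a dispreferred ("rejected") response. The field names below follow the common prompt/chosen/rejected convention used by preference datasets such as ultrafeedback-binarized-preferences-cleaned; the record contents are illustrative, not taken from the actual training data.

```python
# A minimal DPO-style preference record (field names follow the common
# prompt/chosen/rejected convention; contents are illustrative).
record = {
    "prompt": "What is the capital of Korea?",
    "chosen": "The capital of Korea is Seoul.",    # preferred response
    "rejected": "The capital of Korea is Busan.",  # dispreferred response
}

def is_valid_dpo_record(r):
    """Check that a record has the three string fields DPO training expects."""
    return all(isinstance(r.get(k), str) for k in ("prompt", "chosen", "rejected"))

print(is_valid_dpo_record(record))  # True
```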

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies:

    pip install transformers torch
    
  2. Load the Model:

    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    # Both the model and its tokenizer are loaded from the same repository.
    model_id = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    
  3. Prepare Input and Generate Output:

    prompt_template = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\nHuman: {prompt}\nAssistant:\n"
    # The Korean question asks: "What is the capital of Korea? Please choose from
    # the options below. (A) Gyeongseong (B) Busan (C) Pyongyang (D) Seoul (E) Jeonju"
    text = '한국의 수도는 어디인가요? 아래 선택지 중 골라주세요.\n\n(A) 경성\n(B) 부산\n(C) 평양\n(D) 서울\n(E) 전주'
    model_inputs = tokenizer(prompt_template.format(prompt=text), return_tensors='pt')
    outputs = model.generate(**model_inputs, max_new_tokens=256)
    output_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    print(output_text)
    
  4. Use Cloud GPUs: For better performance, consider using cloud services like AWS, GCP, or Azure to access powerful GPUs.
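The prompt assembly in step 3 can be sanity-checked without downloading the 10.8B model, since the template is plain Python string formatting. The snippet below uses an English stand-in question instead of the Korean one above, purely so the assembled prompt is easy to inspect:

```python
# Same template as in step 3, split across lines for readability.
prompt_template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
    "Human: {prompt}\nAssistant:\n"
)

question = "What is the capital of Korea?"  # English stand-in for the Korean question
full_prompt = prompt_template.format(prompt=question)
print(full_prompt)
```

The assembled prompt ends with "Assistant:\n", so the model's generated text continues directly as the assistant's reply.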

License

The EEVE-Korean-Instruct-10.8B-v1.0 model is licensed under the Apache 2.0 License. This allows for both personal and commercial use, modification, and distribution with proper attribution.
