llama-3.2-Korean-Bllossom-3B

Introduction

The Bllossom-3B model is a bilingual large language model that extends Meta-Llama-3.2-3B with Korean language support. It was developed by the Bllossom team to address the absence of Korean capabilities in the original model while maintaining its English performance.

Architecture

Bllossom-3B is based on the Meta-Llama-3.2-3B architecture and supports both English and Korean. It uses the transformers library for text generation, and its weights are distributed in the safetensors format for safe loading.
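As a quick sanity check, you can inspect the published configuration to confirm the underlying Llama architecture. A minimal sketch; the commented values are what the standard transformers config is expected to report for a Llama-family model:

    from transformers import AutoConfig

    config = AutoConfig.from_pretrained("Bllossom/llama-3.2-Korean-Bllossom-3B")
    print(config.model_type)     # expected: "llama"
    print(config.architectures)  # expected: ["LlamaForCausalLM"]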

Training

The model underwent full fine-tuning on 150GB of refined Korean data, providing high-quality instruction tuning without compromising English performance. Training focused on producing a fully bilingual model and deliberately avoided targeting specific benchmarks in order to maintain broader applicability.

Guide: Running Locally

  1. Install Dependencies: Ensure you have torch, transformers, and accelerate installed in your Python environment (accelerate is required for device_map="auto").
    pip install torch transformers accelerate
    
  2. Load the Model: Use the following code to load the tokenizer and model.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    model_id = 'Bllossom/llama-3.2-Korean-Bllossom-3B'
    
    tokenizer = AutoTokenizer.from_pretrained(model_id)
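    # bfloat16 halves memory use vs. float32; device_map="auto" (via accelerate) places layers on available devices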
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
  3. Inference: Input a sample instruction and generate text (a multi-turn sketch follows after these steps).
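    # Sample prompt: a Korean math word problem ("Chul-soo had 20 pencils..."; truncated in the original)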
    instruction = "철수가 20개의 연필을 가지고 있었는데..."
    messages = [{"role": "user", "content": instruction}]
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
    
    outputs = model.generate(
        input_ids,
        max_new_tokens=1024,
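        # stop at either of Llama 3's end-of-sequence tokens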
        eos_token_id=[
            tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
            tokenizer.convert_tokens_to_ids("<|eot_id|>")
        ],
        do_sample=True,
        temperature=0.6,
        top_p=0.9
    )
    
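    # decode only the newly generated tokens, slicing off the prompt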
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
    
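The chat template also supports a system prompt and multi-turn history. A minimal sketch reusing the tokenizer and model loaded above; the message contents are illustrative assumptions, not from the original card:

    messages = [
        {"role": "system", "content": "You are a helpful bilingual assistant."},  # assumed system prompt
        {"role": "user", "content": "Please introduce the Bllossom project in one sentence."},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9)
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))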

Cloud GPUs: For optimal performance, especially with large models, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.
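Alternatively, if local GPU memory is limited, 4-bit quantization via bitsandbytes is one way to shrink the footprint. A sketch, assuming the bitsandbytes package is installed; this is not part of the original card:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # NF4 4-bit weights with bfloat16 compute
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        "Bllossom/llama-3.2-Korean-Bllossom-3B",
        quantization_config=bnb_config,
        device_map="auto",
    )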

License

The Bllossom-3B model is released under the Llama 3.2 Community License, which permits commercial use and distribution subject to its terms.
