SakuraLLM

WN-VN-14B-V0.1

Introduction

WN-VN-14B-V0.1 is a language model for translation between Chinese and Vietnamese. It is based on Qwen2.5-14B-Instruct and has been fine-tuned on official Vietnamese translations of five Chinese web novels, including "惊悚乐园" (Thriller Paradise) and "恐怖屋" (Haunted House).

Architecture

The model is loaded with the AutoModelForCausalLM class from the transformers library, together with its tokenizer for preprocessing and a GenerationConfig that controls decoding. It is a causal language model applied to machine translation.
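To illustrate what a generation configuration carries, here is a minimal standalone sketch. The values below are hypothetical, chosen only for demonstration; the model's actual configuration ships with the checkpoint and is loaded via GenerationConfig.from_pretrained, as shown in the guide.

```python
from transformers import GenerationConfig

# Hypothetical decoding settings for illustration only; the real values
# come from the checkpoint's generation_config.json.
config = GenerationConfig(
    max_new_tokens=1024,  # cap on the length of the generated translation
    do_sample=False,      # greedy (deterministic) decoding
)
print(config.max_new_tokens)
```

Assigning such a config to `model.generation_config` makes these settings the defaults for every `model.generate(...)` call.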

Training

The model was fine-tuned on official Vietnamese translations of five Chinese web novels. This targets the model at literary Chinese-to-Vietnamese translation, exposing it to the narrative style and contextual nuances of web-novel prose.

Guide: Running Locally

To run the WN-VN-14B-V0.1 model locally, follow these steps:

  1. Install the Transformers Library:

    pip install transformers
    
  2. Load the Model and Tokenizer:

    from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
    
    model_path = 'CjangCjengh/WN-VN-14B-v0.1'
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', trust_remote_code=True).eval()
    model.generation_config = GenerationConfig.from_pretrained(model_path, trust_remote_code=True)
    
  3. Prepare the Input Text: Ensure the text length is within the model's constraints (e.g., 1024 characters for Chinese to Vietnamese translation).

  4. Generate Translations:

    text = 'Your text here'
    model_inputs = tokenizer([text], return_tensors='pt').to(model.device)
    generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=1024)
    # Strip the prompt tokens so only the translation is decoded
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)
    
  5. Cloud GPU Suggestions: A 14B-parameter model requires a GPU with substantial VRAM; consider cloud services such as AWS, GCP, or Azure for access to high-performance GPUs.
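Since step 3 caps the input at roughly 1024 characters, longer passages must be split before translation. Below is a minimal, model-free sketch of one possible chunking helper (the model card does not prescribe a splitting strategy, so the function name and boundary rules are assumptions). It breaks text at sentence-ending punctuation where possible and hard-splits oversized sentences.

```python
import re

def split_for_translation(text, max_chars=1024):
    """Split text into chunks of at most max_chars characters,
    breaking after sentence-ending punctuation where possible."""
    # Split after Chinese/Latin sentence terminators or newlines,
    # keeping the delimiter attached to the preceding sentence.
    sentences = re.split(r'(?<=[。！？!?\n])', text)
    chunks, current = [], ''
    for sentence in sentences:
        if len(current) + len(sentence) <= max_chars:
            current += sentence
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is hard-split.
            while len(sentence) > max_chars:
                chunks.append(sentence[:max_chars])
                sentence = sentence[max_chars:]
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed through the `model.generate(...)` call from step 4 and the translated chunks concatenated in order.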

License

The WN-VN-14B-V0.1 model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (cc-by-nc-sa-4.0). This license allows for sharing and adaptation of the model for non-commercial purposes, provided that appropriate credit is given and adaptations are shared under the same terms.