SakuraLLM WN-VN-14B-V0.1
Introduction
WN-VN-14B-V0.1 is a language model for translation between Chinese and Vietnamese. It is based on Qwen2.5-14B-Instruct and was fine-tuned on official Vietnamese translations of five Chinese web novels, including "惊悚乐园" (Thriller Paradise) and "恐怖屋" (Haunted House).
Architecture
The model is loaded through the AutoModelForCausalLM class of the transformers library, with an AutoTokenizer for preprocessing and a GenerationConfig to control the output generation process. It is a causal language model applied to machine translation.
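To illustrate how a GenerationConfig controls decoding, here is a minimal sketch. The parameter values below are illustrative assumptions, not the defaults shipped with WN-VN-14B-V0.1:

```python
from transformers import GenerationConfig

# Build a generation configuration locally; these values are illustrative
# assumptions, not the model's shipped defaults.
config = GenerationConfig(
    max_new_tokens=1024,   # upper bound on the number of generated tokens
    do_sample=True,        # sample from the distribution instead of greedy decoding
    temperature=0.7,       # soften the output distribution
    top_p=0.9,             # nucleus sampling threshold
)
print(config.max_new_tokens, config.do_sample)
```

Assigning such a config to `model.generation_config` (as shown in the guide below) makes these settings the defaults for every `model.generate` call.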
Training
The model was fine-tuned on Vietnamese translations of five Chinese web novels. This approach ensures that the model is optimized for translation tasks between these two languages, leveraging the narrative style and contextual nuances present in these literary works.
Guide: Running Locally
To run the WN-VN-14B-V0.1 model locally, follow these steps:
- Install the transformers library:

```
pip install transformers
```
- Load the model and tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_path = 'CjangCjengh/WN-VN-14B-v0.1'
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', trust_remote_code=True).eval()
model.generation_config = GenerationConfig.from_pretrained(model_path, trust_remote_code=True)
```
- Prepare the input text: ensure the text length is within the model's constraints (e.g., 1024 characters for Chinese-to-Vietnamese translation).
- Generate translations:

```python
text = 'Your text here'
model_inputs = tokenizer([text], return_tensors='pt').to('cuda')  # assumes a CUDA-capable GPU
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=1024)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
- Cloud GPU suggestions: consider cloud services such as AWS, GCP, or Azure for access to high-performance GPUs, which can significantly speed up inference on a 14B-parameter model.
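Because the input length is capped at roughly 1024 characters, longer documents need to be split before translation. The following is a minimal pure-Python sketch; the 1024-character limit comes from the guide above, while the helper name and the set of sentence delimiters are assumptions for illustration:

```python
def split_for_translation(text, max_chars=1024, delimiters='。！？.!?\n'):
    """Split text into chunks of at most max_chars characters,
    preferring to break at sentence-ending delimiters."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look backwards for the last sentence boundary inside the window.
            cut = max((text.rfind(d, start, end) for d in delimiters), default=-1)
            if cut > start:
                end = cut + 1  # keep the delimiter with its sentence
        chunks.append(text[start:end])
        start = end
    return chunks

sample = '第一句。' * 300  # 1200 characters, too long for one pass
parts = split_for_translation(sample)
print(len(parts), all(len(p) <= 1024 for p in parts))  # → 2 True
```

Each chunk can then be passed through the generation snippet above individually and the translated pieces concatenated afterwards.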
License
The WN-VN-14B-V0.1 model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (cc-by-nc-sa-4.0). This license allows for sharing and adaptation of the model for non-commercial purposes, provided that appropriate credit is given and adaptations are shared under the same terms.