WENYANWEN-CHINESE-TRANSLATE-TO-ANCIENT

Introduction

The WENYANWEN-CHINESE-TRANSLATE-TO-ANCIENT model translates modern Chinese text into Classical Chinese. It is part of a broader effort to provide translation models between modern and ancient Chinese.

Architecture

This model is an encoder-decoder architecture implemented using the Transformers library and PyTorch. It is specifically designed for text-to-text generation tasks, focusing on translating modern Chinese into 文言文 (Classical Chinese).
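As a quick sanity check (a sketch, not from the model card), you can load the checkpoint and confirm the encoder-decoder layout from its configuration:

    from transformers import EncoderDecoderModel

    model = EncoderDecoderModel.from_pretrained(
        "raynardj/wenyanwen-chinese-translate-to-ancient")
    # EncoderDecoderModel exposes the two sub-models directly;
    # for this checkpoint both are expected to be BERT-style Chinese models
    print(type(model.encoder).__name__)
    print(type(model.decoder).__name__)
    print(model.config.decoder.is_decoder)  # True: the decoder attends causally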

Training

The model was trained on a dataset of over 900,000 modern-Classical Chinese sentence pairs. More details about the dataset can be found in the project's GitHub repository.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Install the Hugging Face Transformers library and PyTorch.

    pip install transformers torch
    
  2. Load the Model: Use the following Python code to load and run the model.

    from transformers import EncoderDecoderModel, AutoTokenizer
    import torch
    
    PRETRAINED = "raynardj/wenyanwen-chinese-translate-to-ancient"
    tokenizer = AutoTokenizer.from_pretrained(PRETRAINED)
    model = EncoderDecoderModel.from_pretrained(PRETRAINED)
    
    def inference(text):
        # Pad/truncate every input to a fixed 128 tokens and return PyTorch tensors
        tk_kwargs = dict(
            truncation=True,
            max_length=128,
            padding="max_length",
            return_tensors="pt")
    
        inputs = tokenizer([text], **tk_kwargs)
        with torch.no_grad():
            return tokenizer.batch_decode(
                model.generate(
                    inputs.input_ids,
                    attention_mask=inputs.attention_mask,
                    num_beams=3,  # beam search over 3 candidate sequences
                    bos_token_id=101,  # [CLS] in the BERT-style Chinese vocabulary
                    eos_token_id=tokenizer.sep_token_id,  # stop at [SEP]
                    pad_token_id=tokenizer.pad_token_id,
                ), skip_special_tokens=True)
    
  3. Run Inference: Call the inference function with a modern Chinese sentence to get its Classical Chinese translation; a batched variant is sketched after this list.

    result = inference("你连一百块都不肯给我")
    print(result)  # Output: ['不 肯 与 我 百 钱 。']
    
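Because both the tokenizer and model.generate operate on batches, the same pattern extends to several sentences at once. Below is a minimal sketch; the batch_inference helper is illustrative and not part of the original model card:

    def batch_inference(texts):
        # Tokenize all sentences together; padding aligns them to one length
        inputs = tokenizer(
            texts,
            truncation=True,
            max_length=128,
            padding="max_length",
            return_tensors="pt")
        with torch.no_grad():
            output_ids = model.generate(
                inputs.input_ids,
                attention_mask=inputs.attention_mask,
                num_beams=3,
                bos_token_id=101,
                eos_token_id=tokenizer.sep_token_id,
                pad_token_id=tokenizer.pad_token_id,
            )
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)

    print(batch_inference(["你连一百块都不肯给我", "他说他明天会来"]))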

For faster inference, consider running the model on a GPU, such as those available on AWS, Google Cloud, or Azure.
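A minimal sketch of GPU inference, assuming a CUDA-capable device is available (the generation arguments mirror the inference function above):

    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)

    inputs = tokenizer(
        ["你连一百块都不肯给我"],
        truncation=True,
        max_length=128,
        padding="max_length",
        return_tensors="pt").to(device)  # move input tensors to the same device
    with torch.no_grad():
        output_ids = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            num_beams=3,
            bos_token_id=101,
            eos_token_id=tokenizer.sep_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )
    print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))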

License

The model is available under the Apache-2.0 license, allowing for both personal and commercial use, modifications, and distribution.