wenyanwen ancient translate to modern
Introduction
This project provides a model for translating Classical (Ancient) Chinese into Modern Chinese. The model is hosted on Hugging Face and powers an application for reading and translating ancient texts.
Architecture
The model is based on the Encoder-Decoder architecture and is implemented using the Transformers library in PyTorch. It is designed for text-to-text generation tasks, specifically focusing on translating ancient Chinese texts to modern Chinese.
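The snippet below is a minimal sketch of how such an encoder-decoder model is typically assembled with the Transformers library; the checkpoint name used here is an illustrative stand-in, not necessarily the base model behind this release.

```python
from transformers import EncoderDecoderModel

# Illustrative only: an encoder-decoder model can be assembled from two
# pretrained checkpoints. "bert-base-chinese" is a stand-in here, not a
# confirmed base model for this particular release.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-chinese",  # encoder: reads the ancient (source) text
    "bert-base-chinese",  # decoder: generates the modern translation
)
```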
Training
The model was trained on a dataset containing over 900,000 sentence pairs. During training, the source sequences (ancient texts) had all punctuation removed with a 50% probability. The dataset used for training can be found on GitHub.
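A minimal sketch of what the punctuation-dropping augmentation could look like follows; the helper name and the punctuation set are illustrative assumptions, not taken from the actual training code.

```python
import random
import re

# Assumed punctuation set; the real training pipeline may use a different one.
PUNCT = re.compile(r"[,。、;:?!「」『』()《》·,.;:?!()]")

def augment_source(text: str, p: float = 0.5) -> str:
    """With probability p, strip all punctuation from the source sentence."""
    if random.random() < p:
        return PUNCT.sub("", text)
    return text
```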
Guide: Running Locally
To run the model locally, follow these steps:
- Install the Transformers library: ensure the `transformers` library is installed in your Python environment (e.g. `pip install transformers`).
- Import the necessary modules:

  ```python
  from transformers import EncoderDecoderModel, AutoTokenizer
  ```
- Load the pretrained model and tokenizer:

  ```python
  PRETRAINED = "raynardj/wenyanwen-ancient-translate-to-modern"
  tokenizer = AutoTokenizer.from_pretrained(PRETRAINED)
  model = EncoderDecoderModel.from_pretrained(PRETRAINED)
  ```
- Define the inference function (note that `torch` must also be imported):

  ```python
  import torch  # needed for torch.no_grad()

  def inference(text):
      tk_kwargs = dict(
          truncation=True,
          max_length=128,
          padding="max_length",
          return_tensors='pt')
      inputs = tokenizer([text], **tk_kwargs)
      with torch.no_grad():
          return tokenizer.batch_decode(
              model.generate(
                  inputs.input_ids,
                  attention_mask=inputs.attention_mask,
                  num_beams=3,
                  max_length=256,
                  bos_token_id=101,  # 101 is [CLS] in BERT-style vocabularies
                  eos_token_id=tokenizer.sep_token_id,
                  pad_token_id=tokenizer.pad_token_id,
              ),
              skip_special_tokens=True)
  ```
- Run inference: call the `inference` function on your input text to get the modern Chinese translation (see the usage sketch after this list).
- Cloud GPUs: for improved performance, consider running the model on a cloud GPU service such as AWS, Google Cloud, or Azure.
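As referenced in the steps above, the sketch below is a quick end-to-end check that calls `inference` on an arbitrary Classical Chinese sentence; the input is only an example, and the exact output depends on the checkpoint.

```python
# Example input: the opening line of the Analects. inference() returns a
# list with one decoded string per input sentence.
print(inference("子曰:学而时习之,不亦说乎?")[0])
```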
License
The license for this model is specified on the Hugging Face platform; refer to the repository or the model card page for the exact terms before use.