mbart-ja-en
Introduction
MBART-JA-EN is a translation model fine-tuned from Facebook's mbart-large-cc25 checkpoint. It is specifically designed for Japanese-to-English translation and was trained on the JESC (Japanese-English Subtitle Corpus) dataset.
Architecture
The model is based on the mBART (Multilingual BART) architecture, an encoder-decoder transformer for text-to-text generation. It is implemented in PyTorch, which makes it straightforward to use for multilingual sequence-to-sequence tasks such as translation.
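As a quick sanity check on the underlying architecture, the checkpoint's configuration can be inspected with the standard Transformers API. This is a minimal sketch; the attributes shown are standard MBartConfig fields rather than anything specific to this model.
from transformers import MBartConfig

config = MBartConfig.from_pretrained("ken11/mbart-ja-en")
# Encoder/decoder depth, hidden size, and vocabulary size of the checkpoint
print(config.encoder_layers, config.decoder_layers, config.d_model, config.vocab_size)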
Training
The model was fine-tuned on the JESC dataset, a large collection of Japanese-English sentence pairs. The tokenizer is a SentencePiece model trained specifically on the JESC data.
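The original tokenizer training script is not reproduced here, but training a SentencePiece model on a text file of JESC sentences generally looks like the sketch below. The input file name and vocabulary size are illustrative assumptions, not the settings actually used for this model.
import sentencepiece as spm

# Hypothetical input file and vocabulary size; the values used for mbart-ja-en are not documented here
spm.SentencePieceTrainer.train(
    input="jesc_sentences.txt",
    model_prefix="jesc_sp",
    vocab_size=32000,
)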
Guide: Running Locally
To run the MBART-JA-EN model locally, follow these steps:
- Install the Transformers library: Ensure you have the Hugging Face Transformers library installed.
pip install transformers
- Load the Model and Tokenizer:
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("ken11/mbart-ja-en")
model = MBartForConditionalGeneration.from_pretrained("ken11/mbart-ja-en")
- Provide input and generate a translation (a combined sketch of both steps follows the list):
inputs = tokenizer("こんにちは", return_tensors="pt")
translated_tokens = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["en_XX"],
    early_stopping=True,
    max_length=48,
)
pred = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
print(pred)
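Putting the two steps together, a minimal end-to-end sketch might look like the following; the translate helper function is just for illustration and is not part of the model card.
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("ken11/mbart-ja-en")
model = MBartForConditionalGeneration.from_pretrained("ken11/mbart-ja-en")

def translate(text: str) -> str:
    # Tokenize the Japanese input and force English (en_XX) as the target language
    inputs = tokenizer(text, return_tensors="pt")
    translated_tokens = model.generate(
        **inputs,
        decoder_start_token_id=tokenizer.lang_code_to_id["en_XX"],
        early_stopping=True,
        max_length=48,
    )
    return tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

print(translate("こんにちは"))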
For optimal performance, especially when translating large volumes of text, consider running the model on a GPU, for example via cloud instances such as AWS EC2 GPU instances or Google Cloud GPUs.
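On a machine with a CUDA-capable GPU, the model and inputs can be moved to the GPU using the usual PyTorch pattern; this is generic PyTorch usage, not something specific to this model.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
# Move the tokenized inputs to the same device before calling model.generate
inputs = {k: v.to(device) for k, v in inputs.items()}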
License
The model is distributed under the MIT License, allowing for flexible use and distribution.