mbart-large-50-mmt-ko-vi
Introduction
The MBART-LARGE-50-MMT-KO-VI model is a fine-tuned version of mBART-large-50, designed for translating Korean legal documents into Vietnamese. It was developed by Jaeyoon Myoung and Heewon Kwak and is shared by OFU. The model is licensed under MIT.
Architecture
The model is based on the mBART-large-50 architecture, a multilingual sequence-to-sequence Transformer with 12 encoder layers, 12 decoder layers, and a hidden size of 1,024, pretrained across 50 languages and optimized for translation tasks.
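These dimensions can be verified from the model configuration. A minimal sketch, assuming the repository ID ofu-ai/mbart-large-50-mmt-ko-vi (inferred from the model name; adjust to the actual Hub path):

```python
from transformers import AutoConfig

# Repo ID is an assumption inferred from the model name; adjust if needed.
config = AutoConfig.from_pretrained("ofu-ai/mbart-large-50-mmt-ko-vi")
print(config.encoder_layers, config.decoder_layers)  # 12 12
print(config.d_model)                                # 1024
```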
Training
The model was trained on a Korean legal document dataset from AI Hub. Key training parameters (mirrored in the sketch after this list) include:
- Learning rate: 0.0001
- Train batch size: 8
- Evaluation batch size: 8
- Training time: 1 hour 25 minutes on an NVIDIA RTX 4090
- BLEU score: 29.69
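The card does not publish a full training script; the following is a minimal sketch of how the reported hyperparameters could be expressed with Hugging Face Seq2SeqTrainingArguments. Everything not listed above (output_dir, generation settings, and so on) is an assumption:

```python
from transformers import Seq2SeqTrainingArguments

# Only the learning rate and batch sizes are reported in the card;
# output_dir and every other setting here is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="./mbart-large-50-mmt-ko-vi",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    predict_with_generate=True,  # generate translations at eval time for BLEU
)
```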
Preprocessing involved cleaning the text by removing unnecessary whitespace and special characters.
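The exact cleaning rules are not published; a minimal sketch along those lines, where the character whitelist is an assumption:

```python
import re

def clean_text(text: str) -> str:
    # Hypothetical cleaning rules: drop characters outside Hangul, word
    # characters, and common punctuation, then collapse runs of whitespace.
    text = re.sub(r"[^\w\s.,;:()%\-가-힣]", "", text)
    return re.sub(r"\s+", " ", text).strip()
```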
Guide: Running Locally
- Clone the Repository: Download the model repository from Hugging Face.
- Install Dependencies: Ensure Python 3.11.9 and PyTorch 2.4.0 are installed, along with the Hugging Face Transformers library.
- Set Up Environment: Use an NVIDIA GPU, such as the RTX 4090, for optimal performance. Consider cloud GPU services such as AWS or Google Cloud for scalability.
- Run Inference: Use the Transformers library to load the model and generate translations, as shown in the sketch below.
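A minimal inference sketch, assuming the repository ID ofu-ai/mbart-large-50-mmt-ko-vi and the standard mBART-50 language codes ko_KR (Korean) and vi_VN (Vietnamese); the example sentence is hypothetical:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Repo ID is an assumption inferred from the model name; adjust to the actual Hub path.
model_id = "ofu-ai/mbart-large-50-mmt-ko-vi"

tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="ko_KR")
model = MBartForConditionalGeneration.from_pretrained(model_id).to("cuda")  # or "cpu"

# Hypothetical input: "This contract shall be interpreted under the laws of the Republic of Korea."
text = "이 계약은 대한민국 법률에 따라 해석된다."

inputs = tokenizer(text, return_tensors="pt").to(model.device)
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["vi_VN"],  # force Vietnamese output
    max_length=512,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```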
License
The model is licensed under the MIT License, allowing for broad use and modification.