NLLB-200-3.3B
Introduction
NLLB-200-3.3B is a machine translation model developed by Meta, designed to translate single sentences among 200 languages. It is intended for research use, particularly research on low-resource languages, and is not recommended for production deployment or for domain-specific texts such as medical or legal documents.
Architecture
NLLB-200-3.3B is a dense Transformer encoder-decoder model with 3.3 billion parameters. It is trained on parallel multilingual data and on monolingual data constructed from Common Crawl, and it uses a SentencePiece model for preprocessing, which is included with the release of NLLB-200.
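As a minimal sketch of how the SentencePiece-based preprocessing can be exercised through the Hugging Face `transformers` port of the model (the `facebook/nllb-200-3.3B` checkpoint and the `src_lang` keyword are assumptions based on that port, not part of the Fairseq release):

```python
from transformers import AutoTokenizer

# Load the NLLB tokenizer; src_lang selects the FLORES-200 language code
# used to mark the source language (eng_Latn = English in Latin script).
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-3.3B", src_lang="eng_Latn"
)

# The underlying SentencePiece model splits the sentence into subword tokens.
print(tokenizer.tokenize("No language left behind."))
```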
Training
NLLB-200 was trained on a variety of data sources to address the data imbalance between high-resource and low-resource languages. The training methodology emphasizes fairness constraints and is detailed in the paper "No Language Left Behind: Scaling Human-Centered Machine Translation" by the NLLB Team et al. The training data includes both parallel multilingual datasets and monolingual datasets constructed from Common Crawl.
Guide: Running Locally
- Setup: Ensure you have Python and PyTorch installed. Clone the repository from the Fairseq GitHub.
- Dependencies: Install necessary dependencies using pip.
- Download Model: Download the NLLB-200-3.3B model from Hugging Face.
- Run Inference: Use the Fairseq code to run inference, translating sentences between any of the 200 supported languages (a sketch using the Hugging Face transformers port follows this list).
- GPU Requirements: For optimal performance, especially with large models, consider using cloud GPUs such as those from AWS or Google Cloud.
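The following is a minimal end-to-end sketch using the Hugging Face `transformers` port rather than the Fairseq code; the `facebook/nllb-200-3.3B` model id and the FLORES-200 language codes (`eng_Latn`, `fra_Latn`) are assumptions based on that port:

```python
# Install dependencies first:
#   pip install torch transformers sentencepiece

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "facebook/nllb-200-3.3B"  # assumed Hugging Face model id

# src_lang tells the tokenizer which FLORES-200 code marks the input language.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

inputs = tokenizer("The weather is lovely today.", return_tensors="pt")

# Force the decoder to begin with the target-language token (French here),
# which is how NLLB selects the output language at generation time.
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
```

Because the 3.3B checkpoint is large, loading it on CPU is slow; moving the model and inputs to a CUDA device (as suggested in the GPU Requirements item above) is the usual way to get acceptable latency.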
License
The NLLB-200-3.3B model is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). This license allows for sharing and adaptation for non-commercial purposes, provided appropriate credit is given.