NLLB-200-Distilled-1.3B
Introduction
The NLLB-200-Distilled-1.3B model is a machine translation model built primarily for research, with a particular focus on low-resource languages. It supports translation between 200 languages and is intended mainly for the machine translation research community.
Architecture
NLLB-200-Distilled-1.3B is a distilled variant of NLLB-200 with 1.3 billion parameters. A single multilingual model covers all 200 supported languages, rather than a separate model per language pair. Its architecture and training methodology are described in detail in the paper "No Language Left Behind: Scaling Human-Centered Machine Translation" by the NLLB Team.
Training
The model was trained on parallel multilingual data from a variety of sources, together with monolingual data from Common Crawl. Training accounted for data imbalance across languages, and text was preprocessed with SentencePiece tokenization. Translation quality was evaluated with BLEU, spBLEU, and chrF++, supplemented by human evaluation.
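As an illustration of these automatic metrics (not taken from the model card), the sketch below computes BLEU, spBLEU, and chrF++ with the sacrebleu library; the hypothesis and reference sentences are placeholders, and the "flores200" tokenizer option assumes a recent sacrebleu release.

```python
from sacrebleu.metrics import BLEU, CHRF

# Placeholder system outputs and references; in practice these would be
# full test-set translations and their references.
hyps = ["The cat sits on the mat."]
refs = [["The cat is sitting on the mat."]]  # one inner list per reference set

bleu = BLEU()                         # standard BLEU
spbleu = BLEU(tokenize="flores200")   # spBLEU: BLEU on SentencePiece-tokenized text
                                      # (assumption: sacrebleu version with the flores200 tokenizer)
chrf_pp = CHRF(word_order=2)          # chrF++ adds word bigrams to character n-gram chrF

print(bleu.corpus_score(hyps, refs))
print(spbleu.corpus_score(hyps, refs))
print(chrf_pp.corpus_score(hyps, refs))
```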
Guide: Running Locally
- Environment Setup: Install PyTorch and the Hugging Face transformers library.
- Download Model: Use the Hugging Face Model Hub to download the NLLB-200-Distilled-1.3B model.
- Load and Use Model: Load the model in a Python script using the transformers library, as sketched below.
- Cloud GPUs: For intensive computations, consider cloud platforms such as AWS, Google Cloud, or Azure for GPU support.
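Putting the steps above together, the following is a minimal sketch of loading the checkpoint and translating one sentence with transformers. It assumes the Hugging Face model id facebook/nllb-200-distilled-1.3B and the FLORES-200 language codes used by NLLB (e.g. eng_Latn, fra_Latn).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-1.3B"

# src_lang tells the tokenizer which source-language tag to prepend to the input.
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# On a GPU machine, move the model to the device with: model = model.to("cuda")

text = "Machine translation research benefits low-resource languages."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target-language token (French here).
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Any of the 200 FLORES-200 language codes can be substituted for the source and target languages.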
License
The NLLB-200-Distilled-1.3B model is released under the CC-BY-NC-4.0 license, which permits non-commercial use with attribution. Questions and comments can be directed to the issues page of the fairseq GitHub repository.