NLLB-200-3.3B
Introduction
NLLB-200-3.3B is a machine translation model developed by Meta, designed to translate single sentences among 200 languages. It is intended for research use, particularly research on low-resource languages, and is not recommended for production deployment or for domain-specific texts such as medical or legal documents.
Architecture
NLLB-200-3.3B is a dense Transformer encoder-decoder model with 3.3 billion parameters. It is trained on parallel multilingual data and on monolingual data constructed from Common Crawl, and it uses a SentencePiece model for preprocessing, which is included with the release of NLLB-200.
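As a minimal sketch of how the SentencePiece-based preprocessing can be exercised through the Hugging Face `transformers` port of the model (the `facebook/nllb-200-3.3B` checkpoint and the `src_lang` keyword are assumptions based on that port, not part of the Fairseq release):

```python
from transformers import AutoTokenizer

# Load the NLLB tokenizer; src_lang selects the FLORES-200 language code
# used to mark the source language (eng_Latn = English in Latin script).
tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-3.3B", src_lang="eng_Latn"
)

# The underlying SentencePiece model splits the sentence into subword tokens.
print(tokenizer.tokenize("No language left behind."))
```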
Training
NLLB-200 was trained on a variety of data sources to address the data imbalance between high-resource and low-resource languages. The training methodology emphasizes fairness constraints and is detailed in the paper "No Language Left Behind: Scaling Human-Centered Machine Translation" by the NLLB Team et al. The training data includes both parallel multilingual datasets and monolingual datasets constructed from Common Crawl.
Guide: Running Locally
- Setup: Ensure you have Python and PyTorch installed. Clone the repository from the Fairseq GitHub.
- Dependencies: Install necessary dependencies using pip.
- Download Model: Download the NLLB-200-3.3B model from Hugging Face.
- Run Inference: Use the Fairseq code to run inference, translating sentences between any of the 200 supported languages (a sketch using the Hugging Face transformers port follows this list).
- GPU Requirements: For optimal performance, especially with large models, consider using cloud GPUs such as those from AWS or Google Cloud.
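The following is a minimal end-to-end sketch using the Hugging Face `transformers` port rather than the Fairseq code; the `facebook/nllb-200-3.3B` model id and the FLORES-200 language codes (`eng_Latn`, `fra_Latn`) are assumptions based on that port:

```python
# Install dependencies first:
#   pip install torch transformers sentencepiece

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "facebook/nllb-200-3.3B"  # assumed Hugging Face model id

# src_lang tells the tokenizer which FLORES-200 code marks the input language.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

inputs = tokenizer("The weather is lovely today.", return_tensors="pt")

# Force the decoder to begin with the target-language token (French here),
# which is how NLLB selects the output language at generation time.
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
```

Because the 3.3B checkpoint is large, loading it on CPU is slow; moving the model and inputs to a CUDA device (as suggested in the GPU Requirements item above) is the usual way to get acceptable latency.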
License
The NLLB-200-3.3B model is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). This license allows for sharing and adaptation for non-commercial purposes, provided appropriate credit is given.