Introduction

SMaLL-100 is a compact, fast, and massively multilingual machine translation model. It covers more than 10,000 language pairs, spanning the same 100 languages as M2M-100, and achieves results competitive with that much larger model while being significantly smaller and faster. It was introduced in a paper presented at EMNLP 2022.

Architecture

The SMaLL-100 model shares its architecture and configuration with M2M-100 but uses a modified tokenizer that changes how language codes are handled: the target language code is attached to the source input rather than a source language code. Because this tokenizer is not shipped with the transformers library, it is loaded locally from the tokenization_small100.py file distributed with the model. The model is designed to be smaller and faster than its counterparts while maintaining strong performance, particularly for low-resource languages.
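As a rough illustration of the language-code change, the sketch below prepends the target language token to the source sentence. The function name and the `__xx__` token format are assumptions for illustration only, not the actual tokenizer logic:

```python
def format_source(src_text: str, tgt_lang: str) -> str:
    """Prepend the *target* language token to the source sentence, as
    SMaLL-100 does (M2M-100 instead marks the source language).
    The __xx__ token format is assumed here for illustration."""
    return f"__{tgt_lang}__ {src_text}"

print(format_source("Hello world", "fr"))  # __fr__ Hello world
```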

Training

SMaLL-100 is trained as a sequence-to-sequence translation model. During training, the source text is prefixed with the target language code and paired with the target text. The tokenizer depends on the sentencepiece library:

pip install sentencepiece

Training data is available upon request. During generation, the model uses a beam size of 5 and a maximum target length of 256 tokens.
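These decoding settings can be expressed as keyword arguments for the Hugging Face transformers `generate()` method. The `num_beams` and `max_length` names are standard transformers parameters; the dictionary itself is just a convenience:

```python
# Decoding settings from the section above, expressed as Hugging Face
# transformers generate() keyword arguments.
GENERATION_KWARGS = {
    "num_beams": 5,     # beam size of 5
    "max_length": 256,  # maximum target length of 256 tokens
}
# usage, with model and inputs loaded as in the guide below:
# model.generate(**model_inputs, **GENERATION_KWARGS)
```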

Guide: Running Locally

To run SMaLL-100 locally, you need to set up the model and tokenizer:

  1. Install the required packages:

    pip install transformers sentencepiece
    
  2. Load the model and tokenizer (the tokenization_small100.py file must first be downloaded from the model repository so it can be imported locally):

    from transformers import M2M100ForConditionalGeneration
    # SMALL100Tokenizer is defined in tokenization_small100.py from the model repo
    from tokenization_small100 import SMALL100Tokenizer
    
    model = M2M100ForConditionalGeneration.from_pretrained("alirezamsh/small100")
    tokenizer = SMALL100Tokenizer.from_pretrained("alirezamsh/small100", tgt_lang="fr")
    
  3. Prepare text for translation and generate output:

    src_text = "Your text here."
    model_inputs = tokenizer(src_text, return_tensors="pt")
    generated_tokens = model.generate(**model_inputs)
    # decode to a list with one translated string per input
    translated_text = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
    print(translated_text)
    

For optimal performance, consider using cloud GPUs such as AWS EC2, Google Cloud Platform, or Azure.
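The steps in this guide can be wrapped in a small helper. `translate` is a hypothetical convenience function, not part of the SMaLL-100 distribution; it assumes `model` and `tokenizer` were loaded as in step 2, and its defaults mirror the decoding settings reported above:

```python
def translate(model, tokenizer, src_text, num_beams=5, max_length=256):
    """Hypothetical helper wrapping the guide's steps. Defaults mirror the
    reported SMaLL-100 decoding setup (beam size 5, max target length 256)."""
    model_inputs = tokenizer(src_text, return_tensors="pt")
    generated = model.generate(**model_inputs,
                               num_beams=num_beams, max_length=max_length)
    # batch_decode returns one translated string per input sentence
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

# usage: translate(model, tokenizer, "Your text here.")
```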

License

The SMaLL-100 model is licensed under the MIT License, allowing for broad use and modification.
