Introduction

XGLM-7.5B is a multilingual autoregressive language model from Meta AI with 7.5 billion parameters. It covers a diverse set of 30 languages and is designed for tasks such as text generation and few-shot learning. The model was introduced in the paper "Few-shot Learning with Multilingual Language Models."

Architecture

XGLM-7.5B is a decoder-only transformer language model implemented in PyTorch. It was trained on a balanced multilingual corpus totaling 500 billion sub-tokens, which underpins its multilingual capabilities.
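
The architecture can be inspected concretely by loading just the model configuration, without downloading the full weights. This is a minimal sketch; the field names (d_model, num_layers, attention_heads, ffn_dim) follow the XGLMConfig class in the transformers library:

    from transformers import AutoConfig

    # Fetches only the small config file, not the full model weights.
    config = AutoConfig.from_pretrained("facebook/xglm-7.5B")

    print(config.model_type)       # "xglm"
    print(config.num_layers)       # number of decoder layers
    print(config.d_model)          # hidden size
    print(config.attention_heads)  # attention heads per layer
    print(config.ffn_dim)          # feed-forward inner dimension
    print(config.vocab_size)       # multilingual vocabulary size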

Training

The training corpus spans a diverse set of languages with highly skewed token counts: English contributes the most tokens, followed by Russian and Chinese. To balance the data, low-resource languages are upsampled during training. Detailed per-language token statistics are available in the model documentation.
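
Upsampling of this kind is commonly done with temperature-based sampling, where a language's sampling probability is proportional to its token share raised to an exponent below 1, which flattens the distribution in favor of low-resource languages. The sketch below is illustrative only: the token counts and the exponent of 0.3 are assumptions, not the paper's actual settings.

    # Temperature-based upsampling: p_l ∝ (n_l / N) ** alpha, with alpha < 1.
    # Smaller alpha flattens the distribution more, boosting rare languages.
    token_counts = {"en": 800e9, "ru": 150e9, "zh": 120e9, "sw": 2e9}  # hypothetical
    alpha = 0.3  # illustrative exponent

    total = sum(token_counts.values())
    weights = {lang: (n / total) ** alpha for lang, n in token_counts.items()}
    norm = sum(weights.values())
    probs = {lang: w / norm for lang, w in weights.items()}

    for lang, p in probs.items():
        raw = token_counts[lang] / total
        print(f"{lang}: raw share {raw:.4f} -> sampling prob {p:.4f}")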

Guide: Running Locally

To run XGLM-7.5B locally, follow these steps:

  1. Install Dependencies: Ensure you have Python and PyTorch installed. Install the transformers library using pip:

    pip install transformers
    
  2. Load Model and Tokenizer:

    from transformers import XGLMTokenizer, XGLMForCausalLM
    
    tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-7.5B")
    model = XGLMForCausalLM.from_pretrained("facebook/xglm-7.5B")
    
  3. Run Inference: Use the model for open-ended text generation or for zero- and few-shot evaluation tasks such as COPA (Choice of Plausible Alternatives); see the sketch after this list.

  4. Use Cloud GPUs: The 7.5B-parameter checkpoint needs substantial GPU memory (roughly 15 GB in float16, twice that in float32), so consider cloud GPUs from providers such as AWS, Google Cloud Platform, or Azure.
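
The snippet below is a minimal sketch of both uses from step 3: greedy text generation, and COPA-style scoring that picks the candidate continuation with the higher average log-probability under the model. Loading in float16 with device_map="auto" assumes a GPU and the accelerate package; the prompt and candidate sentences are illustrative.

    import torch
    from transformers import XGLMTokenizer, XGLMForCausalLM

    tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-7.5B")
    model = XGLMForCausalLM.from_pretrained(
        "facebook/xglm-7.5B",
        torch_dtype=torch.float16,  # halves memory; assumes a GPU
        device_map="auto",          # requires `pip install accelerate`
    )
    model.eval()

    # --- Text generation ---
    inputs = tokenizer("Le chat est assis sur", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

    # --- COPA-style scoring: choose the more plausible continuation ---
    def avg_logprob(prompt: str, continuation: str) -> float:
        ids = tokenizer(prompt + " " + continuation,
                        return_tensors="pt").input_ids.to(model.device)
        with torch.no_grad():
            logits = model(ids).logits
        # Token i is predicted from position i-1, so shift logits and targets.
        logprobs = torch.log_softmax(logits[:, :-1].float(), dim=-1)
        targets = ids[:, 1:]
        token_lp = logprobs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
        return token_lp.mean().item()

    premise = "The man broke his toe. What was the cause?"
    candidates = ["He got a hole in his sock.", "He dropped a hammer on his foot."]
    best = max(candidates, key=lambda c: avg_logprob(premise, c))
    print("More plausible:", best)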

License

XGLM-7.5B is released under the MIT License, which permits open use and modification of the model under the license terms.
