XGLM-7.5B
Introduction
XGLM-7.5B is a multilingual autoregressive language model from Meta AI with 7.5 billion parameters. It covers 30 languages and is designed for tasks such as text generation and few-shot learning. The model was introduced in the paper "Few-shot Learning with Multilingual Language Models."
Architecture
XGLM-7.5B uses a decoder-only transformer architecture and is implemented in PyTorch. It was trained on a balanced multilingual corpus comprising 500 billion sub-tokens, which underpins its multilingual capabilities.
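To see the concrete hyperparameters behind "decoder-only transformer" (layer count, hidden size, attention heads), you can fetch just the model configuration without downloading the full 7.5B-parameter weights. A minimal sketch using the XGLMConfig class from transformers:

```python
from transformers import XGLMConfig

# Downloads only the small config.json, not the multi-gigabyte weights.
config = XGLMConfig.from_pretrained("facebook/xglm-7.5B")
print(config)  # shows layer count, hidden size, attention heads, vocab size, ...
```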
Training
The training data spans a diverse set of languages with highly uneven token counts: English contributes the most tokens, followed by Russian and Chinese. To balance the corpus, low-resource languages are up-sampled during training; a sketch of this idea follows below. Detailed per-language token statistics are available in the documentation.
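The exact balancing recipe is described in the paper; a common way to realize low-resource up-sampling is exponent-smoothed multinomial sampling over per-language token shares. The sketch below illustrates that idea only; the token counts and the smoothing exponent `alpha` are hypothetical, not XGLM's actual values:

```python
# Exponent-smoothed language sampling: raising each language's raw share
# to a power alpha < 1 flattens the distribution, up-sampling low-resource
# languages relative to high-resource ones.
counts = {"en": 800e9, "ru": 150e9, "zh": 100e9, "sw": 1e9}  # hypothetical token counts
alpha = 0.3  # hypothetical smoothing exponent

total_tokens = sum(counts.values())
raw = {lang: n / total_tokens for lang, n in counts.items()}      # raw shares
smoothed = {lang: share ** alpha for lang, share in raw.items()}  # share ** alpha
norm = sum(smoothed.values())
probs = {lang: s / norm for lang, s in smoothed.items()}          # normalized sampling probs

for lang in counts:
    print(f"{lang}: raw share {raw[lang]:.4f} -> sampling prob {probs[lang]:.4f}")
```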
Guide: Running Locally
To run XGLM-7.5B locally, follow these steps:

- Install dependencies: Ensure you have Python and PyTorch installed, then install the transformers library with pip:

  ```bash
  pip install transformers
  ```

- Load the model and tokenizer:

  ```python
  from transformers import XGLMTokenizer, XGLMForCausalLM

  tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-7.5B")
  model = XGLMForCausalLM.from_pretrained("facebook/xglm-7.5B")
  ```

- Run inference: Use the model for text generation or for evaluation tasks such as COPA (Choice of Plausible Alternatives); see the sketch after this list.

- Use cloud GPUs: For better performance with a model of this size, consider cloud-based GPUs from providers such as AWS, Google Cloud Platform, or Azure.
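As referenced in the inference step above, here is a minimal sketch of both uses. The prompt, the generation settings, and the COPA-style item are illustrative, not taken from the model card or the paper:

```python
import torch
import torch.nn.functional as F
from transformers import XGLMTokenizer, XGLMForCausalLM

tokenizer = XGLMTokenizer.from_pretrained("facebook/xglm-7.5B")
model = XGLMForCausalLM.from_pretrained("facebook/xglm-7.5B")
model.eval()

# 1) Text generation with a hypothetical prompt.
inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))

# 2) COPA-style evaluation: pick the alternative to which the model
#    assigns the higher total log-probability.
def sequence_logprob(text: str) -> float:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    # Shift so the logits at position t score the token at position t + 1.
    log_probs = F.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    token_lp = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp.sum().item()

premise = "The man broke his toe because"  # hypothetical COPA-style item
alternatives = ["he dropped a hammer on his foot.",
                "he got a hole in his sock."]
best = max(alternatives, key=lambda alt: sequence_logprob(f"{premise} {alt}"))
print(best)
```

Comparing summed log-probabilities is one standard way to score multiple-choice completions with a causal language model; length-normalizing the scores is a common variant.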
License
XGLM-7.5B is released under the MIT License, allowing for open use and modification of the model as per the license terms.