flaubert_base_cased
Introduction
FlauBERT is a French language model based on the BERT architecture, pretrained on a large and diverse French corpus. It is designed to enhance French natural language processing (NLP) applications. FlauBERT is associated with FLUE, an evaluation framework akin to the GLUE benchmark for English, facilitating reproducible experiments and advancements in French NLP.
Architecture
FlauBERT models vary in size and complexity:
- flaubert_small_cased: 6 layers, 8 attention heads, 512 embedding dimension, 54 million parameters.
- flaubert_base_uncased: 12 layers, 12 attention heads, 768 embedding dimension, 137 million parameters.
- flaubert_base_cased: 12 layers, 12 attention heads, 768 embedding dimension, 138 million parameters.
- flaubert_large_cased: 24 layers, 16 attention heads, 1024 embedding dimension, 373 million parameters.
The small model is only partially trained and is intended mainly for debugging.
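You can verify these sizes by reading them off each checkpoint's configuration. A minimal sketch, assuming the hub IDs follow the flaubert/<name> pattern used elsewhere in this card and that the Hugging Face Hub is reachable:

from transformers import FlaubertConfig

# Print layer count, attention heads, and embedding dimension per checkpoint
for name in ["flaubert/flaubert_small_cased", "flaubert/flaubert_base_uncased",
             "flaubert/flaubert_base_cased", "flaubert/flaubert_large_cased"]:
    config = FlaubertConfig.from_pretrained(name)
    print(f"{name}: {config.n_layers} layers, {config.n_heads} heads, emb_dim={config.emb_dim}")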
Training
FlauBERT models were trained on the CNRS Jean Zay supercomputer using a large and heterogeneous French corpus, giving them broad coverage of written French. The resulting checkpoints serve as pretrained backbones for downstream French NLP tasks.
Guide: Running Locally
To use FlauBERT with Hugging Face's Transformers library, follow these steps:
- Install Requirements: ensure you have PyTorch and Transformers installed.
pip install torch transformers
- Load Model and Tokenizer:
import torch
from transformers import FlaubertModel, FlaubertTokenizer

modelname = 'flaubert/flaubert_base_cased'
# output_loading_info=True additionally returns a dict of loading diagnostics
flaubert, log = FlaubertModel.from_pretrained(modelname, output_loading_info=True)
# do_lowercase=False keeps casing intact, as required for a cased checkpoint
flaubert_tokenizer = FlaubertTokenizer.from_pretrained(modelname, do_lowercase=False)
- Inference Example (a batched variant is sketched after this list):
sentence = "Le chat mange une pomme."
token_ids = torch.tensor([flaubert_tokenizer.encode(sentence)])
last_layer = flaubert(token_ids)[0]
print(last_layer.shape)  # (batch size, sequence length, hidden dimension)
cls_embedding = last_layer[:, 0, :]  # first-token embedding, usable as a sentence vector
- Considerations: use do_lowercase=False for cased models and do_lowercase=True for uncased models.
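For multiple sentences, the tokenizer's standard call API can pad a batch and return tensors directly. A minimal sketch, reusing flaubert and flaubert_tokenizer from the steps above; the example sentences are illustrative:

import torch

sentences = ["Le chat mange une pomme.", "Le chien dort sur le tapis."]
# padding=True pads to the longest sentence; return_tensors="pt" yields PyTorch tensors
batch = flaubert_tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    last_layer = flaubert(**batch)[0]  # (batch size, sequence length, hidden dimension)
cls_embeddings = last_layer[:, 0, :]   # one first-token embedding per sentence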
Cloud GPU Suggestion
To efficiently run FlauBERT, especially larger models, consider using cloud-based GPUs such as AWS EC2, Google Cloud Platform, or Azure.
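As a minimal sketch, moving the model and inputs onto a GPU when one is available looks like this, reusing the objects from the guide above:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
flaubert = flaubert.to(device)
token_ids = token_ids.to(device)
with torch.no_grad():
    last_layer = flaubert(token_ids)[0]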
License
FlauBERT is released under the MIT License, allowing for open use and modification.