flaubert_base_cased


Introduction

FlauBERT is a French language model based on the BERT architecture, pretrained on a large and diverse French corpus. It is designed to enhance French natural language processing (NLP) applications. FlauBERT is associated with FLUE, an evaluation framework akin to the GLUE benchmark for English, facilitating reproducible experiments and advancements in French NLP.

Architecture

FlauBERT models vary in size and complexity:

  • flaubert_small_cased: 6 layers, 8 attention heads, 512 embedding dimension, 54 million parameters.
  • flaubert_base_uncased: 12 layers, 12 attention heads, 768 embedding dimension, 137 million parameters.
  • flaubert_base_cased: 12 layers, 12 attention heads, 768 embedding dimension, 138 million parameters.
  • flaubert_large_cased: 24 layers, 16 attention heads, 1024 embedding dimension, 373 million parameters.

The small model is partially trained and mainly suited for debugging purposes.
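
These dimensions can be checked programmatically by loading only a checkpoint's configuration, without downloading the full weights. The sketch below assumes the attribute names exposed by the Transformers FlaubertConfig class (n_layers, n_heads, emb_dim):

    from transformers import FlaubertConfig
    
    # Fetch just the configuration (no weights) and print the
    # architecture facts listed above for the base cased checkpoint.
    config = FlaubertConfig.from_pretrained("flaubert/flaubert_base_cased")
    print(config.n_layers, config.n_heads, config.emb_dim)  # 12 12 768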

Training

FlauBERT models were trained on the CNRS Jean Zay supercomputer using a large and diverse French corpus, giving them broad coverage of written French and making them a solid foundation for downstream French NLP tasks.

Guide: Running Locally

To use FlauBERT with Hugging Face's Transformers library, follow these steps:

  1. Install Requirements: Ensure you have PyTorch and Transformers installed.

    pip install torch transformers
    
  2. Load Model and Tokenizer:

    import torch
    from transformers import FlaubertModel, FlaubertTokenizer
    
    modelname = 'flaubert/flaubert_base_cased'
    # output_loading_info=True also returns a dict reporting any missing
    # or unexpected weights encountered while loading the checkpoint.
    flaubert, log = FlaubertModel.from_pretrained(modelname, output_loading_info=True)
    # do_lowercase must match the checkpoint: False for cased models.
    flaubert_tokenizer = FlaubertTokenizer.from_pretrained(modelname, do_lowercase=False)
    
  3. Inference Example:

    sentence = "Le chat mange une pomme."
    token_ids = torch.tensor([flaubert_tokenizer.encode(sentence)])
    last_layer = flaubert(token_ids)[0]
    print(last_layer.shape)
    cls_embedding = last_layer[:, 0, :]
    
  4. Considerations: Pass do_lowercase=False for cased models and do_lowercase=True for uncased models, so that tokenization matches the checkpoint's casing. FlauBERT can also predict masked words directly, as sketched after this list.
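
Because FlauBERT is trained as a masked language model, it can fill in masked words out of the box. The snippet below is a minimal sketch using the Transformers fill-mask pipeline; the example sentence is illustrative, and the mask token is read from the tokenizer rather than hardcoded, since FlauBERT does not use BERT's [MASK] token.

    from transformers import pipeline
    
    # Load the checkpoint together with its language-modeling head.
    unmasker = pipeline("fill-mask", model="flaubert/flaubert_base_cased")
    
    # Use the tokenizer's own mask token instead of hardcoding one.
    mask = unmasker.tokenizer.mask_token
    for prediction in unmasker(f"Le chat mange une {mask}."):
        print(prediction["token_str"], prediction["score"])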

Cloud GPU Suggestion

To run FlauBERT efficiently, especially the larger models, consider cloud GPU instances from providers such as AWS EC2, Google Cloud Platform, or Azure.
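
Once a GPU is available, moving the model onto it is a small change. A minimal sketch, reusing the flaubert model and token_ids from the guide above:

    import torch
    
    # Pick the GPU if one is visible, otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    flaubert = flaubert.to(device)
    
    # Inputs must live on the same device as the model.
    with torch.no_grad():
        last_layer = flaubert(token_ids.to(device))[0]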

License

FlauBERT is released under the MIT License, allowing for open use and modification.
