CamemBERT-Base-SQuADFR-FQuAD-PIAF Model

Introduction

The CamemBERT-Base-SQuADFR-FQuAD-PIAF model is a French question-answering model based on the CamemBERT architecture. It has been fine-tuned on a combination of three French question-answering datasets: PIAF, FQuAD, and SQuAD-FR. Given a question and a French context passage, the model extracts the span of the context that best answers the question.

Architecture

The model uses the CamemBERT architecture, a French language model derived from RoBERTa, which is itself a variant of BERT. CamemBERT is pretrained specifically on French text, which generally improves performance on French-language tasks. The model is implemented using the Transformers library and supports both PyTorch and TensorFlow frameworks.

Training

The training process involved fine-tuning the CamemBERT base model using the following datasets:

PIAF v1.1: A native French question-answering dataset.
FQuAD v1.0: A native French question-answering dataset built from French Wikipedia articles.
SQuAD-FR: The SQuAD dataset automatically translated into French.

Training was conducted with the following hyperparameters:

Model type: CamemBERT
Batch size: 12 (per GPU)
Learning rate: 3e-5
Number of epochs: 4
Maximum sequence length: 384
Document stride: 128
Save steps: 10,000
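
The maximum sequence length and document stride control how contexts longer than the model input are split into overlapping windows. The following is a minimal sketch of that idea, not the training script itself. It assumes the stride is the step between window starts (some scripts instead treat it as the overlap between windows), and that four special tokens surround the question and context; `split_into_windows` and its parameters are hypothetical names used for illustration.

```python
def split_into_windows(num_context_tokens, question_len,
                       max_seq_length=384, doc_stride=128,
                       num_special_tokens=4):
    """Return (start, end) token ranges covering the context.

    Assumption: doc_stride is the step between consecutive window starts.
    Assumption: num_special_tokens=4 matches a RoBERTa-style pair encoding.
    """
    # Room left for context tokens once the question and special tokens fit.
    max_context = max_seq_length - question_len - num_special_tokens
    if max_context <= 0:
        raise ValueError("question is too long for the chosen max_seq_length")

    windows = []
    start = 0
    while start < num_context_tokens:
        end = min(start + max_context, num_context_tokens)
        windows.append((start, end))
        if end == num_context_tokens:
            break  # the last window reached the end of the context
        # Never step further than one window, so no token is skipped.
        start += min(doc_stride, max_context)
    return windows


if __name__ == "__main__":
    # A 1000-token context with a 20-token question.
    for window in split_into_windows(1000, question_len=20):
        print(window)
```

Each window is scored separately, and the best-scoring answer span across windows is kept. A smaller stride gives more overlap, and therefore more windows to process.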

The model achieved an F1 score of 79.81 and an exact match score of 55.14 on the FQuAD dataset, and an F1 score of 80.61 and an exact match score of 59.54 on the SQuAD-FR dataset.
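
For readers unfamiliar with these metrics, the sketch below shows how exact match and token-level F1 are typically computed for one prediction in SQuAD-style evaluation. It is a simplified illustration, not the script that produced the scores above: official evaluation scripts usually also strip articles, take the maximum over several reference answers, and average over all questions before reporting a percentage. The function names are illustrative.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("un peintre français", "peintre français"))
    print(f1_score("un peintre français", "peintre français"))
```

Because F1 gives partial credit for overlapping tokens while exact match does not, F1 is always at least as high as exact match, which is consistent with the gap between the two scores reported above.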

Guide: Running Locally

To run the CamemBERT-Base-SQuADFR-FQuAD-PIAF model locally, follow these steps:

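Install the Transformers library together with a backend such as PyTorch. The CamemBERT tokenizer may also require the sentencepiece package:
```
pip install transformers torch sentencepiece
```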

Use the model in a Python script:

```
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_name = 'etalab-ia/camembert-base-squadFR-fquad-piaf'
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)

# Ask a question about a French context passage.
result = nlp(
    question="Qui est Claude Monet?",
    context="Claude Monet, né le 14 novembre 1840 à Paris et mort le 5 décembre 1926 à Giverny, est un peintre français et l’un des fondateurs de l'impressionnisme."
)

# The result is a dict with the answer text, its score, and its character offsets.
print(result)
```

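The model runs on a CPU, but a GPU speeds up inference noticeably; pass `device=0` to `pipeline` to use the first CUDA device. If no local GPU is available, consider cloud GPUs such as those offered by AWS, Google Cloud, or Azure.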
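
The question-answering pipeline returns a dict with the keys `score`, `start`, `end`, and `answer`, where `start` and `end` are character offsets into the context. A minimal sketch of post-processing such a result follows. The helper `select_answer`, the `min_score` threshold, and the example values are hypothetical; the example dict is hand-written rather than real model output.

```python
def select_answer(result, context, min_score=0.5):
    """Return the answer span from the context, or None if the score is low.

    Assumption: min_score=0.5 is an arbitrary cutoff; tune it on your data.
    """
    if result["score"] < min_score:
        return None
    # Slice the context with the character offsets instead of trusting
    # the answer string alone, so the span can be highlighted in place.
    return context[result["start"]:result["end"]]


if __name__ == "__main__":
    context = "Claude Monet est un peintre français."
    answer = "un peintre français"
    start = context.find(answer)
    # Hand-written stand-in for a pipeline result (the score is made up).
    example = {"score": 0.87, "start": start,
               "end": start + len(answer), "answer": answer}
    print(select_answer(example, context))
```

Rejecting low-score answers is useful when the context may not contain an answer at all, since the model always returns some span.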

License

The model is released under the Apache 2.0 License, allowing for both personal and commercial use, modification, and distribution. Ensure compliance with the license terms when using the model.