gpt2-french-small

dbddv01

Introduction

The GPT2-FRENCH-SMALL model is a French-language text generation model derived from OpenAI's GPT-2 small model. It was trained on a limited dataset (190MB) of French Wikipedia text using transfer learning and fine-tuning, over a little more than a day on a Google Colab Pro environment with a single 16GB GPU. The model serves as a proof of concept that a language model can be built for any language with minimal resources.

Architecture

The model architecture is based on OpenAI's GPT-2 small configuration, adapted to French by fine-tuning on a French Wikipedia dataset. The fine-tuning process used the Hugging Face Transformers and Tokenizers libraries, integrated into the fastai v2 deep learning framework.
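
The model card does not detail the tokenization step, but adapting GPT-2 to French typically starts with a tokenizer fitted to French text. Below is a minimal sketch, using the Hugging Face Tokenizers library, of training a byte-level BPE tokenizer on a hypothetical plain-text corpus file fr_wiki.txt; the file name and settings are illustrative assumptions, not values from the model card.

    from tokenizers import ByteLevelBPETokenizer

    # Hypothetical plain-text corpus extracted from French Wikipedia.
    corpus_files = ["fr_wiki.txt"]

    # Byte-level BPE, the scheme GPT-2 uses; vocab size matches GPT-2 small.
    tokenizer = ByteLevelBPETokenizer()
    tokenizer.train(
        files=corpus_files,
        vocab_size=50257,
        min_frequency=2,
        special_tokens=["<|endoftext|>"],
    )

    # Writes vocab.json and merges.txt, loadable later with GPT2Tokenizer.
    tokenizer.save_model(".")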

Training

Training involved fine-tuning the pre-trained English GPT-2 small model on a French Wikipedia dataset. This approach demonstrates that a language model for a new language can be created efficiently with limited resources; the model could likely be improved further with a larger dataset and more powerful training infrastructure.
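
The original training pipeline used fastai v2; as an illustration only, the sketch below reproduces the same idea, fine-tuning the pre-trained English GPT-2 small checkpoint on French text, using the Hugging Face Trainer API instead. The corpus path fr_wiki.txt and all hyperparameters are assumptions, not values from the model card.

    from transformers import (
        GPT2LMHeadModel,
        GPT2TokenizerFast,
        Trainer,
        TrainingArguments,
        DataCollatorForLanguageModeling,
        TextDataset,
    )

    # Start from the pre-trained English GPT-2 small weights.
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    # Hypothetical plain-text dump of the French Wikipedia subset.
    train_dataset = TextDataset(
        tokenizer=tokenizer,
        file_path="fr_wiki.txt",
        block_size=128,
    )

    # Causal language modeling objective (mlm=False).
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    training_args = TrainingArguments(
        output_dir="gpt2-french-small",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        save_steps=10_000,
    )

    trainer = Trainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_dataset,
    )
    trainer.train()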

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python installed, then install the Hugging Face Transformers library and PyTorch (required for model loading and the 'pt' tensors below) using pip:

    pip install transformers torch
    
  2. Load the Model: Use the Transformers library to load the model:

    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    
    # Download the fine-tuned French model and its tokenizer from the Hugging Face Hub.
    model_name = "dbddv01/gpt2-french-small"
    model = GPT2LMHeadModel.from_pretrained(model_name)
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    
  3. Inference: Generate text by passing a prompt to the model (a sampling variant is sketched after this list):

    # Encode a French prompt, e.g. "Votre texte ici" ("Your text here").
    input_text = "Votre texte ici"
    input_ids = tokenizer.encode(input_text, return_tensors='pt')
    # GPT-2 has no pad token; reuse the EOS token to avoid a generation warning.
    output = model.generate(input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    
  4. Recommended Environment: For optimal performance, consider using cloud-based GPUs, such as those available on Google Colab, AWS, or Azure.
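
By default, generate uses greedy decoding, which tends to produce repetitive text. A sampling variant often works better for open-ended generation; the top_k and top_p values below are illustrative assumptions, not settings from the model card.

    # Sampling usually yields more varied French text than greedy decoding.
    output = model.generate(
        input_ids,
        max_length=100,
        do_sample=True,   # sample from the distribution instead of taking the argmax
        top_k=50,         # illustrative values, not from the model card
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))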

License

The use of this model is subject to the licensing terms provided by the original creator, dbddv01, and the Hugging Face platform. Be sure to review and comply with any applicable license agreements.
