french camembert postag model
gilf
Introduction
The french-camembert-postag-model is a part-of-speech tagging model for French. It was trained on the Free French Treebank dataset and uses the camembert-base tokenizer and model.
Architecture
The model is based on CamemBERT, a transformer model specifically designed for the French language. It supports a variety of tags to classify parts of speech, including nouns, verbs, adjectives, and more.
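The exact tag set can be read from the model configuration rather than guessed. The sketch below assumes the checkpoint exposes the usual id2label mapping that Transformers token-classification configs carry; summarize_tags and load_id2label are hypothetical helper names, not part of the model card.

```python
from typing import Dict, List


def summarize_tags(id2label: Dict[int, str]) -> List[str]:
    """Return the distinct tag names from an id -> label mapping, sorted."""
    return sorted(set(id2label.values()))


def load_id2label(model_id: str = "gilf/french-camembert-postag-model") -> Dict[int, str]:
    """Fetch the model config and return its id -> label mapping.

    Needs network access and the transformers package. Assumes the
    config defines id2label, as token-classification configs normally do.
    """
    # Imported lazily so summarize_tags stays usable without transformers.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_id)
    return dict(config.id2label)
```

Calling summarize_tags(load_id2label()) would list the tags this checkpoint predicts. The helper also works on any hand-written mapping, which is convenient for checking downstream code offline.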
Training
This model was trained on the Free French Treebank dataset, which is available on GitHub. It uses camembert-base for both tokenization and token classification.
Guide: Running Locally
To use this model locally, follow these steps:
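- Install Transformers, together with a backend such as PyTorch and the sentencepiece package, which the CamemBERT tokenizer may need:
    pip install transformers torch sentencepiece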
- Load Model and Tokenizer:
    from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

    tokenizer = AutoTokenizer.from_pretrained("gilf/french-camembert-postag-model")
    model = AutoModelForTokenClassification.from_pretrained("gilf/french-camembert-postag-model")
    nlp_token_class = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)
    result = nlp_token_class('Face à un choc inédit, les mesures mises en place par le gouvernement ont permis une protection forte et efficace des ménages')
    print(result)
- Suggested Cloud GPUs: For faster inference, consider running the model on a cloud GPU instance from a provider such as AWS EC2, Google Cloud Platform, or Azure Virtual Machines.
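The pipeline returns a list of dictionaries, which is awkward to read directly. As a minimal sketch, the helpers below turn that output into (word, tag) pairs and build the pipeline on a GPU when one is visible. The names to_word_tag_pairs and build_tagger are hypothetical. The sketch also uses aggregation_strategy="simple", the newer replacement for grouped_entities=True in recent Transformers releases; whether the two give identical groupings for this model is an assumption.

```python
from typing import Any, Dict, List, Tuple


def to_word_tag_pairs(results: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Turn token-classification pipeline output into (word, tag) pairs.

    Grouped output stores the tag under "entity_group"; ungrouped
    output stores it under "entity". Both are handled here.
    """
    pairs = []
    for item in results:
        tag = item.get("entity_group", item.get("entity"))
        pairs.append((item["word"], tag))
    return pairs


def build_tagger(model_id: str = "gilf/french-camembert-postag-model"):
    """Create the tagging pipeline on GPU 0 if available, else on CPU."""
    # Imported lazily so to_word_tag_pairs works without these packages.
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1  # -1 selects the CPU
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
        device=device,
    )
```

With these in place, to_word_tag_pairs(build_tagger()("Les mesures ont permis une protection forte")) would yield one pair per grouped word. The conversion step can be checked offline on hand-written dictionaries of the same shape.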
License
Use of the french-camembert-postag-model is subject to the licensing terms given by Hugging Face and by the dataset's original repository. Check both before using or redistributing the model or the dataset.