french camembert postag model

gilf

Introduction

The french-camembert-postag-model is a part-of-speech tagging model for French. It was trained on the Free French Treebank dataset and uses the camembert-base tokenizer and model.

Architecture

The model is based on CamemBERT, a RoBERTa-style transformer language model pretrained on French text. It is used here for token classification: each token in a sentence is assigned a part-of-speech tag, such as noun, verb, or adjective.
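A token-classification model of this kind produces one score vector per token, and the predicted tag is the label with the highest score. The following is a minimal sketch of that final step in plain Python. The function name `tags_from_logits`, the three-entry label map, and the scores are hypothetical placeholders, not the model's actual tag set or output.

```python
def tags_from_logits(logits, id2label):
    """Pick the highest-scoring label for each token's score vector."""
    tags = []
    for token_scores in logits:
        # Index of the largest score for this token
        best = max(range(len(token_scores)), key=token_scores.__getitem__)
        tags.append(id2label[best])
    return tags


# Hypothetical label map; the real model defines its own, larger tag set
id2label = {0: "DET", 1: "NOUN", 2: "VERB"}

# Made-up scores for three tokens, one row per token
logits = [
    [2.1, 0.3, -1.0],
    [0.2, 3.4, 0.1],
    [-0.5, 0.0, 1.7],
]

predicted = tags_from_logits(logits, id2label)
print(predicted)
```

With a loaded Transformers model, the corresponding label map is normally available as `model.config.id2label`.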

Training

This model was trained on the Free French Treebank dataset, which is available on GitHub. It uses camembert-base for both the tokenizer and the underlying model.

Guide: Running Locally

To use this model locally, follow these steps:

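  1. Install Transformers, together with PyTorch and SentencePiece, which the model and the CamemBERT tokenizer require:

    pip install transformers torch sentencepiece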
    
  2. Load Model and Tokenizer:

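    from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
    
    # Download the tokenizer and the fine-tuned tagger from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained("gilf/french-camembert-postag-model")
    model = AutoModelForTokenClassification.from_pretrained("gilf/french-camembert-postag-model")
    
    # aggregation_strategy="simple" replaces the deprecated grouped_entities=True:
    # it merges adjacent sub-word pieces that share a tag into one entry
    nlp_token_class = pipeline('token-classification', model=model, tokenizer=tokenizer, aggregation_strategy="simple")
    
    # Tag a French sentence and print the resulting list of dictionaries
    result = nlp_token_class('Face à un choc inédit, les mesures mises en place par le gouvernement ont permis une protection forte et efficace des ménages')
    print(result)
    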
    
  3. Suggested Cloud GPUs: The model can run on a CPU, but for faster inference on large volumes of text, consider a cloud GPU instance from a provider such as AWS EC2, Google Cloud Platform, or Azure Virtual Machines.
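The `result` printed in step 2 is a list of dictionaries, one per tagged span, with the keys `entity_group`, `score`, `word`, `start`, and `end` when the pipeline groups sub-word pieces. The sketch below shows one way to reduce such a list to (word, tag) pairs and count tags. The helper name `to_word_tag_pairs` is hypothetical, and the sample entries are hand-written stand-ins with made-up tags and scores, not actual output from the model.

```python
from collections import Counter


def to_word_tag_pairs(groups):
    """Turn grouped token-classification output into (word, tag) pairs."""
    return [(g["word"], g["entity_group"]) for g in groups]


# Hand-written stand-in for pipeline output; tags and scores are illustrative
sample = [
    {"entity_group": "DET", "score": 0.99, "word": "les", "start": 23, "end": 26},
    {"entity_group": "NOUN", "score": 0.98, "word": "mesures", "start": 27, "end": 34},
]

pairs = to_word_tag_pairs(sample)
tag_counts = Counter(tag for _, tag in pairs)

for word, tag in pairs:
    print(f"{word}\t{tag}")
```

In real use, pass the pipeline's `result` to the helper in place of `sample`.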

License

Use of the french-camembert-postag-model is governed by the licensing terms provided by Hugging Face and by the dataset's original repository. Check both before using or redistributing the model or the dataset.
