tblard/tf-allocine
Introduction
TF-ALLOCINÉ is a French sentiment analysis model based on CamemBERT, fine-tuned on user reviews from Allociné.fr. It reaches roughly 97% accuracy and F1 on both the validation and test splits.
Architecture
The model leverages the CamemBERT architecture, a variant of the BERT model tailored for the French language. It is designed for text classification tasks, specifically sentiment analysis in this instance.
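As a quick sanity check, the architecture family and label set can be read from the hosted configuration without downloading any weights (a minimal sketch; the printed values depend on the published config):

from transformers import AutoConfig

# Load only the model configuration, not the weights.
config = AutoConfig.from_pretrained("tblard/tf-allocine")
print(config.model_type)   # architecture family, expected "camembert"
print(config.num_labels)   # number of sentiment classes
print(config.id2label)     # mapping from class index to label name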
Training
The model was fine-tuned on a large-scale dataset scraped from Allociné.fr, a popular French movie review site. The training process yielded high performance metrics with a validation accuracy of 97.39% and a test accuracy of 97.44%.
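Those figures were measured on the full Allociné validation and test splits. The sketch below shows how such an accuracy number is computed over any labeled sample; the two reviews are invented for illustration and stand in for a real held-out split:

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("tblard/tf-allocine")
model = TFAutoModelForSequenceClassification.from_pretrained("tblard/tf-allocine")
nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# Hypothetical labeled reviews standing in for a held-out split.
examples = [
    ("Un film magnifique, je recommande.", "POSITIVE"),
    ("Scénario creux et jeu d'acteur affligeant.", "NEGATIVE"),
]

correct = sum(nlp(text)[0]["label"] == expected for text, expected in examples)
print(f"accuracy: {correct / len(examples):.2%}")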
Guide: Running Locally
To run the TF-ALLOCINÉ model locally, follow these steps:
- Install the Hugging Face Transformers library together with TensorFlow, which the TF model classes require:
pip install tensorflow transformers
- Import the necessary classes and load the pre-trained model and tokenizer:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("tblard/tf-allocine")
model = TFAutoModelForSequenceClassification.from_pretrained("tblard/tf-allocine")
- Create a sentiment analysis pipeline (a note on output format and long inputs follows these steps):
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
- Use the pipeline to analyze text sentiment:
print(nlp("Alad'2 est clairement le meilleur film de l'année 2018."))  # "Alad'2 is clearly the best film of 2018." -> POSITIVE
print(nlp("NUL...A...CHIER ! FIN DE TRANSMISSION."))  # roughly "UTTER CRAP! END OF TRANSMISSION." -> NEGATIVE
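Each call returns a list containing a dictionary with a label and a confidence score. Reviews longer than CamemBERT's 512-token limit will raise an error unless truncation is requested; in recent Transformers versions this can be passed at call time (a hedged usage note, with illustrative values):

result = nlp("Alad'2 est clairement le meilleur film de l'année 2018.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}] (score is illustrative)

# Placeholder long review; real reviews over 512 tokens need truncation.
very_long_review = "Un avis interminable. " * 500
print(nlp(very_long_review, truncation=True))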
For faster inference on large batches of reviews, a cloud GPU is recommended, such as those available on AWS, Google Cloud, or Azure.
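To confirm that TensorFlow can actually see a GPU on the machine, a one-line sanity check suffices:

import tensorflow as tf

# An empty list means inference will run on CPU only.
print(tf.config.list_physical_devices("GPU"))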
License
The model and associated code are available under the terms specified in the GitHub repository by Théophile Blard. Proper citation is requested if the work is used in any capacity.