Chocolatine-Admin-3B-SFT-v0.3b
jpacifico
Introduction
Chocolatine-Admin-3B-SFT-v0.3b is a French language model specialized in administrative language, developed by Jonathan Pacifico in collaboration with Microsoft. It is fine-tuned from the Chocolatine-3B-Instruct-DPO-v1.2 model, which is itself based on Microsoft's Phi-3.5-mini-instruct.
Architecture
The model belongs to the Chocolatine series and is fine-tuned specifically for French administrative language tasks. It uses the architecture of Microsoft's Phi-3.5-mini-instruct and is adapted for handling administrative terminology.
Training
The model was trained using a dataset derived from the French DITP's official administrative lexicon, consisting of 2362 terms. The data preparation included extracting lexicon pages, reformulating definitions for readability, and generating prompt-answer pairs. The fine-tuning process involved 11 epochs on an A100 GPU instance via Azure Machine Learning.
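The last data-preparation step, generating prompt-answer pairs from lexicon entries, can be sketched as follows. This is a minimal illustration rather than the author's actual pipeline: the `PROMPT_TEMPLATE` wording, the `make_pairs` helper, and the sample entry are all hypothetical placeholders, and the real dataset format is not described in the source.

```python
# Hypothetical prompt wording; the template actually used for training is not documented here.
PROMPT_TEMPLATE = "Que signifie le terme administratif « {term} » ?"


def make_pairs(lexicon):
    """Turn {term: definition} entries into prompt-answer records."""
    pairs = []
    for term, definition in lexicon.items():
        term, definition = term.strip(), definition.strip()
        if not term or not definition:
            continue  # skip incomplete lexicon entries
        pairs.append(
            {
                "prompt": PROMPT_TEMPLATE.format(term=term),
                "answer": definition,
            }
        )
    return pairs


# Placeholder entry for illustration only, not taken from the DITP lexicon.
sample_lexicon = {
    "terme exemple": "Définition reformulée pour être plus lisible.",
    "entrée vide": "   ",
}
pairs = make_pairs(sample_lexicon)
```

Each record keeps the reformulated definition as the answer, so a supervised fine-tuning run sees one question per term; entries with an empty term or definition are dropped rather than padded.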
Guide: Running Locally
To run Chocolatine-Admin-3B-SFT-v0.3b locally, follow these steps:
- Install Transformers: Ensure that the transformers library is installed in your environment.
- Import Libraries:
    import transformers
    from transformers import AutoTokenizer
- Prepare the Prompt: Format your message as a prompt.
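    # Assumption: Hugging Face repo id inferred from the author handle and model name.
    new_model = "jpacifico/Chocolatine-Admin-3B-SFT-v0.3b"

    message = [
        {"role": "system", "content": "You are a helpful assistant chatbot."},
        {"role": "user", "content": "What is a Large Language Model?"},
    ]
    tokenizer = AutoTokenizer.from_pretrained(new_model)
    # Render the chat messages into a single prompt string using the model's chat template.
    prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)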
- Create a Pipeline: Set up a text generation pipeline.
    pipeline = transformers.pipeline(
        "text-generation",
        model=new_model,
        tokenizer=tokenizer,
    )
- Generate Text: Use the pipeline to generate text sequences.
    sequences = pipeline(
        prompt,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        num_return_sequences=1,
        max_length=200,  # counts the prompt tokens as well as the generated ones
    )
    print(sequences[0]['generated_text'])
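The steps above can be gathered into two small functions, shown here as a sketch under stated assumptions rather than the author's reference script. The `MODEL_ID` value is inferred from the author handle and model name and should be checked against the model hub; `build_messages` and `generate` are hypothetical helper names. The sketch also swaps `max_length` for `max_new_tokens` so that the limit applies only to generated tokens, and sets `return_full_text=False` so the prompt is not echoed back.

```python
# Assumption: repo id inferred from the author handle and model name.
MODEL_ID = "jpacifico/Chocolatine-Admin-3B-SFT-v0.3b"


def build_messages(question, system_prompt="You are a helpful assistant chatbot."):
    """Return a chat-style message list suitable for apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def generate(question, model_id=MODEL_ID, max_new_tokens=200):
    """Load the model, format the prompt, and return the generated text."""
    # Imported lazily so build_messages works without transformers installed.
    import transformers
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    prompt = tokenizer.apply_chat_template(
        build_messages(question), add_generation_prompt=True, tokenize=False
    )
    pipe = transformers.pipeline("text-generation", model=model_id, tokenizer=tokenizer)
    sequences = pipe(
        prompt,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        num_return_sequences=1,
        max_new_tokens=max_new_tokens,  # limit on generated tokens only
        return_full_text=False,  # return the completion without the prompt
    )
    return sequences[0]["generated_text"]
```

Calling `generate("Qu'est-ce qu'un récépissé ?")` downloads the model weights on first use, so it is best run on a machine with enough memory or a GPU.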
For optimal performance, consider using cloud GPUs such as those provided by Azure.
License
Chocolatine-Admin-3B-SFT-v0.3b is licensed under the MIT License, allowing for flexibility in usage and distribution.