magnum v2.5 12b kto
Introduction
MAGNUM-V2.5-12B-KTO is an experimental text generation model developed by anthracite-org. It employs a hybrid reinforcement learning strategy combining KTO (Kahneman-Tversky Optimization) and DPOP (DPO-Positive), using rejected data sampled from the original model and "chosen" data from the finetuning dataset. The model aims to replicate the prose quality of the Claude 3 models and is fine-tuned on top of anthracite-org/magnum-12b-v2.
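KTO scores each completion independently against a reference point, rather than requiring paired preferences, by weighting desirable and undesirable examples separately. The sketch below is a simplified, illustrative take on a KTO-style objective in PyTorch; it is not the training code used for this model, and the function name and defaults (`beta`, `lambda_d`, `lambda_u`) are placeholders.

```python
import torch

def kto_style_loss(policy_logratio: torch.Tensor,
                   kl_estimate: torch.Tensor,
                   desirable: bool,
                   beta: float = 0.1,
                   lambda_d: float = 1.0,
                   lambda_u: float = 1.0) -> torch.Tensor:
    """Simplified KTO-style loss for a single completion.

    policy_logratio: log pi_theta(y|x) - log pi_ref(y|x), summed over tokens
    kl_estimate:     running estimate of KL(pi_theta || pi_ref), the reference point
    desirable:       True for "chosen" data, False for rejected samples
    """
    if desirable:
        # Push the policy's log-ratio above the reference point
        value = lambda_d * torch.sigmoid(beta * (policy_logratio - kl_estimate))
        weight = lambda_d
    else:
        # Push the policy's log-ratio below the reference point
        value = lambda_u * torch.sigmoid(beta * (kl_estimate - policy_logratio))
        weight = lambda_u
    # The loss drives each example's "value" toward its weight
    return weight - value
```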
Architecture
The model is built upon the anthracite-org/magnum-12b-v2 base model, with a focus on text generation and conversational applications. It supports nine languages: English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, and Japanese. The model has been instruct-tuned using ChatML formatting to enhance its interactive capabilities.
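ChatML delimits each turn with `<|im_start|>` and `<|im_end|>` tokens. Assuming the standard ChatML layout, a prompt with a system message and one user turn looks like the following (the message contents are placeholders):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Write a short scene set in a lighthouse.<|im_end|>
<|im_start|>assistant
```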
Training
MAGNUM-V2.5-12B-KTO was trained on a selection of datasets, including the Stheno dataset, Opus_Instruct_25k, Opus_WritingStruct, Sonnet3.5-SlimOrcaDedupCleaned, and Opus_Instruct_3k, which supplied the instruction-following data used for fine-tuning. The training strategy itself was the hybrid reinforcement learning approach described above, optimizing for both prose quality and generalization.
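DPOP (DPO-Positive) extends the standard DPO objective with a penalty that discourages the policy's likelihood of the chosen completion from falling below that of the reference model. The following is a minimal sketch of that loss for one preference pair, assuming summed token log-probabilities are already computed; it is illustrative only, not the exact recipe used for this model.

```python
import torch
import torch.nn.functional as F

def dpop_loss(chosen_logps: torch.Tensor,
              rejected_logps: torch.Tensor,
              ref_chosen_logps: torch.Tensor,
              ref_rejected_logps: torch.Tensor,
              beta: float = 0.1,
              lam: float = 5.0) -> torch.Tensor:
    """Simplified DPO-Positive loss for one (chosen, rejected) pair."""
    chosen_ratio = chosen_logps - ref_chosen_logps        # log pi/pi_ref on chosen
    rejected_ratio = rejected_logps - ref_rejected_logps  # log pi/pi_ref on rejected
    # Penalty is positive only when the policy assigns the chosen completion
    # a lower likelihood than the reference model does
    penalty = lam * torch.clamp(ref_chosen_logps - chosen_logps, min=0.0)
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio - penalty))
```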
Guide: Running Locally
To run the MAGNUM-V2.5-12B-KTO model locally, follow these steps:
- Environment Setup: Ensure you have Python and the Hugging Face Transformers library installed.
- Model Download: Use the Hugging Face Model Hub to download the model and its dependencies.
- Code Implementation: Implement the model using a script that follows the ChatML formatting for inputs (see the sketch after this list).
- Execution: Run your script to interact with the model.
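A minimal sketch of these steps, assuming the repository id is anthracite-org/magnum-v2.5-12b-kto and that there is enough GPU memory to load a 12B model in bfloat16:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anthracite-org/magnum-v2.5-12b-kto"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a ChatML prompt via the tokenizer's chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short scene set in a lighthouse."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that device_map="auto" requires the accelerate package to be installed.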
For better performance, especially when working with large models like MAGNUM-V2.5-12B-KTO, consider using cloud GPU services such as AWS EC2, Google Cloud Compute Engine, or Azure.
License
The MAGNUM-V2.5-12B-KTO model is released under the Apache-2.0 license, which permits personal and commercial use, modification, and distribution, provided that the license and attribution notices are retained.