T-pro-it-1.0 (t-tech)
Introduction
T-pro-it-1.0 is a model from the Qwen 2.5 model family, designed for further fine-tuning rather than immediate deployment as a conversational assistant. Users are responsible for additional training and for ensuring that the model's responses meet ethical and safety standards, especially in industrial or commercial use cases.
Architecture
T-pro-it-1.0 builds on continual pre-training and alignment techniques. Training comprises two stages of pre-training on a diverse dataset mix, followed by supervised fine-tuning and preference tuning to improve instruction-following performance.
Training
- Pre-training Stage 1: Uses 100B tokens of diverse Russian data and replayed English data, including Common Crawl, books, code, and proprietary datasets.
- Pre-training Stage 2: Uses 40B tokens, combining instruction and pre-training data.
- Supervised Fine-Tuning (SFT): Uses 1B tokens of diverse instruction data.
- Preference Tuning: Further refines the model with 1B tokens to improve its helpfulness.
On both proprietary and open-source benchmarks, T-pro-it-1.0 is evaluated against models such as GPT-4o and Llama-3.3-70B-Instruct and shows competitive performance.
Guide: Running Locally
1. Install the Transformers library:

   pip install transformers
2. Import the necessary libraries:

   import torch
   from transformers import AutoTokenizer, AutoModelForCausalLM
3. Load the model and tokenizer:

   model_name = "t-tech/T-pro-it-1.0"
   tokenizer = AutoTokenizer.from_pretrained(model_name)
   model = AutoModelForCausalLM.from_pretrained(
       model_name,
       torch_dtype="auto",   # select a suitable precision automatically
       device_map="auto",    # place weights on available GPUs/CPU
   )
4. Generate text:

   Define your chat prompt and call the model to generate a response; a minimal sketch follows this list.
5. Consider using cloud GPUs:

   Services like AWS, GCP, or Azure offer cloud GPU instances that can run a model of this size efficiently.
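
A minimal generation sketch for step 4, assuming the model ships a standard chat template usable via tokenizer.apply_chat_template; the system message, user prompt, and sampling parameters below are illustrative placeholders, not values taken from the model card:

   from transformers import AutoTokenizer, AutoModelForCausalLM

   model_name = "t-tech/T-pro-it-1.0"
   tokenizer = AutoTokenizer.from_pretrained(model_name)
   model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

   # Build a chat-formatted prompt; the system message is an illustrative placeholder.
   messages = [
       {"role": "system", "content": "You are a helpful assistant."},
       {"role": "user", "content": "Explain gradient descent in two sentences."},
   ]
   input_ids = tokenizer.apply_chat_template(
       messages,
       add_generation_prompt=True,  # append the assistant-turn marker
       return_tensors="pt",
   ).to(model.device)

   # Sampling parameters are illustrative; tune them for your task.
   output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)

   # Decode only the newly generated tokens, skipping the prompt.
   response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
   print(response)

Because the model is intended for further fine-tuning rather than out-of-the-box assistant use, treat raw outputs as a starting point and validate them before deployment.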
License
T-pro-it-1.0 is provided under a license that places responsibility for ethical and safe use on the deployer. Ensure compliance with all applicable legal requirements when using the model.