T pro it 1.0

t-tech

Introduction

T-PRO-IT-1.0 is a model from the Qwen 2.5 model family, designed for further fine-tuning rather than immediate deployment as a conversational assistant. Users are responsible for additional training and ensuring the model's responses meet ethical and safety standards, especially in industrial or commercial use cases.

Architecture

T-PRO-IT-1.0 builds upon continual pre-training and alignment techniques. It includes two stages of pre-training using a diverse dataset mix, followed by supervised fine-tuning and preference tuning to enhance its performance.

Training

  • Pre-training Stage 1: Utilizes 100B tokens from diverse Russian and replayed English data, including Common Crawl, books, code, and proprietary datasets.
  • Pre-training Stage 2: Consumes 40B tokens, combining instruction and pre-training data.
  • Supervised Fine-Tuning (SFT): Involves 1B tokens of diverse instruction data.
  • Preference Tuning: Further refines the model with 1B tokens to improve its helpfulness.

Proprietary and open-source benchmarks evaluate T-PRO-IT-1.0 against models like GPT-4o and Llama-3.3-70B-Instruct, showcasing competitive performance.

Guide: Running Locally

  1. Install the Transformers Library:

    pip install transformers
    
  2. Import Necessary Libraries:

    from transformers import AutoTokenizer, AutoModelForCausalLM
    import torch
    
  3. Load the Model and Tokenizer:

    model_name = "t-tech/T-pro-it-1.0"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
    
  4. Generate Text:

    • Define your prompt and use the model to generate responses.
  5. Consider using Cloud GPUs:

    • Services like AWS, GCP, or Azure offer cloud GPU instances to efficiently run the model.

License

T-PRO-IT-1.0 is provided under a license that places responsibility for ethical and safe use on the deployer. Ensure compliance with all applicable legal requirements when using the model.

More Related APIs