Llama-Deepsync-1B
Introduction
Llama-Deepsync-1B, published by prithivMLmods, is a fine-tuned version of the Llama-3.2-1B-Instruct model developed for text generation tasks. It is optimized for deep reasoning, logical structuring, and problem-solving, making it suitable for education, programming, and creative writing. It supports multilingual text generation and can produce step-by-step solutions and structured outputs such as JSON.
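The structured-output capability mentioned above can be illustrated with a minimal sketch: when the model is prompted to answer in JSON, the reply can be validated with Python's `json` module before downstream use. The reply text below is hypothetical, not an actual generation.

```python
import json

# Hypothetical reply text from the model when prompted to answer in JSON
# (illustrative only; not a real generation).
reply = '{"steps": ["identify the operation", "compute 6 * 7"], "answer": 42}'

# Parse and validate that the output is well-formed JSON before using it.
data = json.loads(reply)
print(data["answer"])  # 42
```

Validating generated JSON this way catches malformed outputs early instead of letting them propagate into downstream code.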
Architecture
Llama 3.2 is an auto-regressive language model built on an optimized transformer architecture. It employs supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety. The model supports long-context capabilities, handling up to 128K tokens.
Training
The model is trained using a combination of supervised fine-tuning and reinforcement learning from human feedback to improve its performance in instruction following, coding, mathematics, and long-text generation. It also offers multilingual support for over 29 languages.
Guide: Running Locally
To run the Llama-Deepsync-1B model locally:
- Install Transformers: ensure you have `transformers >= 4.43.0` by running `pip install --upgrade transformers`.
- Load the Model: use the Transformers library to load and run the model.
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Deepsync-1B"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
- Interact with the Model: use the pipeline to generate text with role-based messages.
```python
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
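With chat-style input, the text-generation pipeline returns the full conversation under the `"generated_text"` key, so the last element is the newly generated assistant turn. A minimal sketch of that return shape and the extraction step (the assistant text below is a stand-in, not a real generation):

```python
# Hedged sketch of the pipeline's return shape for chat-style input;
# the assistant content is illustrative, not an actual model output.
outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "Arrr, I be yer pirate chatbot, matey!"},
        ]
    }
]

# The last message in "generated_text" is the generated assistant turn.
reply = outputs[0]["generated_text"][-1]
print(reply["content"])
```

Indexing with `[-1]` is why the snippet above prints only the model's reply rather than the whole conversation.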
For cloud GPU support, consider platforms like AWS, Google Cloud, or Azure to enhance performance.
License
The Llama-Deepsync-1B model is licensed under the CreativeML Open RAIL-M license.