TinyLlama-1.1B-Chat-v1.0

Introduction

TinyLlama-1.1B-Chat-v1.0 is a conversational model built on the Llama architecture and sized for deployment where compute and memory are limited. It generates human-like responses to user prompts.

Architecture

TinyLlama-1.1B adopts the same architecture and tokenizer as Llama 2, so it can be plugged into many open-source projects built around Llama. With 1.1 billion parameters, it is compact enough for applications with constrained compute and memory.
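
As a quick sanity check, the shared architecture family and parameter count can be verified with the transformers library. This is a minimal sketch, not part of the official instructions; the first call downloads a couple of gigabytes of weights:

  from transformers import AutoConfig, AutoModelForCausalLM

  # The config reports the same "llama" architecture family as Llama 2.
  config = AutoConfig.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
  print(config.model_type)  # -> "llama"

  # Loading the weights confirms the roughly 1.1 billion parameter count.
  model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
  print(f"{sum(p.numel() for p in model.parameters()):,} parameters")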

Training

The model is fine-tuned from the TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T checkpoint following Hugging Face's Zephyr training recipe: supervised fine-tuning on a variant of the UltraChat dataset, followed by preference alignment (DPO) on the UltraFeedback dataset, which contains 64,000 prompts with ranked model completions.
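
For illustration, preference pairs of the kind used in that alignment stage can be inspected with the datasets library. The dataset ID and split name below are assumptions based on the Zephyr recipe's public release, not something stated in this card:

  from datasets import load_dataset

  # Assumed dataset ID/split from the Zephyr recipe's binarized UltraFeedback release.
  prefs = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")
  example = prefs[0]
  print(example["prompt"])    # the original user prompt
  print(example["chosen"])    # the higher-ranked conversation
  print(example["rejected"])  # the lower-ranked conversation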

Guide: Running Locally

To run the TinyLlama-1.1B-Chat-v1.0 model locally, follow these steps:

  1. Install Dependencies:

    • transformers version 4.34 or later is required; if your installed version is older, install it from source along with accelerate:
      pip install git+https://github.com/huggingface/transformers.git
      pip install accelerate
      
  2. Setup and Run:

    • Use the following Python script to generate text:
      import torch
      from transformers import pipeline
      
      # Load the chat model; bfloat16 halves memory use, and device_map="auto"
      # places the weights on an available GPU (this requires accelerate).
      pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")
      
      messages = [
          {
              "role": "system",
              "content": "You are a friendly chatbot who always responds in the style of a pirate",
          },
          {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
      ]
      # Format the conversation with the model's built-in chat template, appending
      # the assistant turn marker so generation continues from there.
      prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
      outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
      print(outputs[0]["generated_text"])
      
  3. Consider Cloud GPUs: For faster inference or heavier workloads, consider cloud GPU services such as AWS, Google Cloud, or Azure. For memory-constrained local hardware, see the quantization sketch after this list.
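
If local GPU memory is the bottleneck rather than raw speed, 4-bit quantization is one way to shrink the footprint. The sketch below uses transformers' BitsAndBytesConfig and assumes the optional bitsandbytes package is installed; it is an alternative loading path, not part of the official instructions:

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

  # Assumption: bitsandbytes is installed (pip install bitsandbytes) and a CUDA GPU is present.
  # NF4 quantization cuts the ~2.2 GB bfloat16 weight footprint to well under 1 GB.
  bnb = BitsAndBytesConfig(
      load_in_4bit=True,
      bnb_4bit_quant_type="nf4",
      bnb_4bit_compute_dtype=torch.bfloat16,
  )
  model = AutoModelForCausalLM.from_pretrained(
      "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
      quantization_config=bnb,
      device_map="auto",
  )
  tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")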

License

The TinyLlama-1.1B-Chat-v1.0 model is licensed under the Apache-2.0 license, which allows for broad usage, modification, and distribution of the software.
