TinyLLama-v0

Maykeye

Introduction

TinyLLama-v0 is an initial attempt to recreate the roneneldan/TinyStories-1M model using the Llama architecture. It is a text-generation model implemented in PyTorch and distributed under the Apache 2.0 license.

Architecture

The model uses the Llama architecture with the tokenizer from open_llama_3b. The tokenizer can raise compatibility issues in some local setups, but works in cloud environments where the required libraries are pre-installed. During training, stories longer than the context size are truncated; no sliding window is used.
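The truncation behavior described above can be sketched in plain Python. This is an illustration, not the repository's actual code; the context size of 2048 tokens is a hypothetical value for the example, and each story is represented as a list of token ids.

```python
def truncate_stories(stories, context_size):
    """Truncate each tokenized story to the context size.

    Stories longer than the context are simply cut; because no
    sliding window is used, the tail of a long story is discarded
    rather than trained on in a second pass.
    """
    return [tokens[:context_size] for tokens in stories]

# Example with a hypothetical context size of 2048 tokens:
stories = [list(range(3000)), list(range(100))]
truncated = truncate_stories(stories, 2048)
print([len(t) for t in truncated])  # → [2048, 100]
```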

Training

Training is documented in the train.ipynb notebook. To reproduce it, download TinyStoriesV2-GPT4-train.txt and TinyStoriesV2-GPT4-valid.txt into the same directory as the notebook and run its cells. Training took roughly 9 hours on a 40 GB A100 GPU, using about 30 GB of VRAM. The validation file is not fully consumed during training, so its exact contents are flexible. A basic caching mechanism is used when shuffling stories; it is slated for refinement in future versions.
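The story-shuffling cache mentioned above could look roughly like the following. This is a hedged sketch, not the notebook's actual code: it assumes stories in the TinyStoriesV2 text files are delimited by `<|endoftext|>`, splits the file once, caches the result, and returns a freshly shuffled copy on each call.

```python
import random

_story_cache = {}

def load_stories(path, seed=None):
    """Return a shuffled list of stories from a TinyStoriesV2 file.

    The file is read and split only once per path (a minimal cache,
    as described above); each call then shuffles a copy of the cached
    list. The '<|endoftext|>' delimiter is an assumption based on the
    TinyStories data format.
    """
    if path not in _story_cache:
        with open(path, encoding="utf-8") as f:
            text = f.read()
        _story_cache[path] = [
            s.strip() for s in text.split("<|endoftext|>") if s.strip()
        ]
    stories = list(_story_cache[path])
    random.Random(seed).shuffle(stories)
    return stories
```

Passing a seed makes the epoch order reproducible; omitting it gives a different shuffle each call.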

Guide: Running Locally

  1. Setup Environment: Ensure that you have Python and PyTorch installed.
  2. Download Resources: Obtain the TinyStoriesV2-GPT4-train.txt and TinyStoriesV2-GPT4-valid.txt files.
  3. Run Training: Execute the train.ipynb notebook to start training the model.
  4. Validation: Use the valid.py script with the command python valid.py path/to/TinyStoriesV2-GPT4-valid.txt.
  5. Demo: A demonstration script is available as demo.py.
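Step 4's validation run can be understood as measuring how well the model predicts the held-out stories. A standard way to report this is perplexity, the exponential of the mean per-token negative log-likelihood. The helper below is illustrative only; valid.py's actual implementation and reported metric may differ.

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (in nats).

    exp(mean NLL) is the conventional language-model metric a
    validation script like valid.py could report; this helper is a
    sketch, not the script's actual code.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# Example: three tokens with NLLs of 1, 2, and 3 nats
print(round(perplexity([1.0, 2.0, 3.0]), 3))  # exp(2.0) ≈ 7.389
```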

Cloud GPUs

For efficient training, consider a cloud GPU such as an NVIDIA A100 (40 GB), which comfortably covers the roughly 30 GB of VRAM used during training.

License

This project is licensed under the Apache-2.0 License.
