TinyLLama-v0
Maykeye/TinyLLama-v0
Introduction
TinyLLama-v0 is an initial version aimed at recreating the roneneldan/TinyStories-1M model using the Llama architecture. The model focuses on text generation, is implemented with the PyTorch framework, and is distributed under the Apache 2.0 license.
Architecture
The model uses the Llama architecture with the tokenizer from open_llama_3b. The tokenizer has some local compatibility issues, but these are resolved in cloud environments with pre-installed libraries. The model is built to generate text; stories that exceed the context size are truncated during training, with no sliding window.
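Truncation without a sliding window can be sketched as follows. This is a minimal illustration, not the notebook's actual code; the context size of 2048 and the token-id list are assumptions for the example:

```python
def truncate_to_context(token_ids, context_size=2048):
    """Keep at most `context_size` tokens of a story; drop the rest.

    No sliding window: tokens beyond the context limit are simply
    discarded rather than turned into additional training windows.
    """
    return token_ids[:context_size]

# Toy example: a "story" of 5000 token ids is cut down to 2048.
story = list(range(5000))
truncated = truncate_to_context(story)
print(len(truncated))  # -> 2048
```

With a sliding window, the remaining 2952 tokens would instead form further training examples; here they are simply lost.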
Training
The training process is documented in the train.ipynb notebook. To begin, download TinyStoriesV2-GPT4-train.txt and TinyStoriesV2-GPT4-valid.txt into the same directory as the notebook and execute the cells. Training takes approximately 9 hours on a 40 GB A100 GPU, using around 30 GB of VRAM. The process does not fully use the validation file, so its exact contents are flexible. A basic caching mechanism is used when shuffling stories; it will be refined in future versions.
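The caching and shuffling described above might look roughly like this. This is a hedged sketch, not the notebook's code: the `<|endoftext|>` separator and the module-level cache are assumptions (the TinyStoriesV2 dumps delimit stories with that marker):

```python
import random

_story_cache = None  # simple module-level cache so the file is parsed only once


def load_stories(path):
    """Split a TinyStories dump into individual stories, caching the result."""
    global _story_cache
    if _story_cache is None:
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # Assumption: stories are separated by the <|endoftext|> marker.
        _story_cache = [s.strip() for s in text.split("<|endoftext|>") if s.strip()]
    return _story_cache


def shuffled_stories(path, seed=None):
    """Return a freshly shuffled copy of the cached stories, e.g. once per epoch."""
    stories = list(load_stories(path))
    random.Random(seed).shuffle(stories)
    return stories
```

Shuffling a copy leaves the cache itself untouched, so each epoch can reshuffle without re-reading the file.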
Guide: Running Locally
- Setup Environment: Ensure that you have Python and PyTorch installed.
- Download Resources: Obtain the TinyStoriesV2-GPT4-train.txt and TinyStoriesV2-GPT4-valid.txt files.
- Run Training: Execute the train.ipynb notebook to start training the model.
- Validation: Use the valid.py script with the command python valid.py path/to/TinyStoriesV2-GPT4-valid.txt.
- Demo: A demonstration script is available as demo.py.
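The valid.py script is not reproduced here, but evaluating a language model on the validation file typically reduces to averaging per-token negative log-likelihood and exponentiating it into perplexity. A minimal sketch under that assumption (the probabilities below are toy values, not model output):

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood of the target tokens)."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)


# Toy example: if the model assigned probability 0.25 to every target token,
# the perplexity is 1 / 0.25, i.e. about 4.
logs = [math.log(0.25)] * 10
print(perplexity(logs))  # ≈ 4.0
```

Lower perplexity on the validation file means the model is less "surprised" by held-out stories.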
Cloud GPUs
For efficient training, consider using a cloud GPU such as an NVIDIA A100, which provides sufficient VRAM and performance for this workload.
License
This project is licensed under the Apache-2.0 License.