SimpleStories-125M
lennart-finke

Introduction
SimpleStories-125M is a text generation model designed to produce coherent and simple English narratives. It leverages model distillation for efficient performance and is accessible via the Hugging Face Model Hub.
Architecture
The model utilizes the Llama architecture, which is a transformer-based neural network. It incorporates PyTorchModelHubMixin for easy integration with Hugging Face's ecosystem, allowing seamless model loading and usage.
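To illustrate what the mixin contributes, here is a toy, stdlib-only sketch of the save/load round trip that PyTorchModelHubMixin automates. The class names and file layout below are illustrative stand-ins, not the real huggingface_hub API:

```python
import json
import os
import tempfile

class ToyHubMixin:
    # Illustrative stand-in for PyTorchModelHubMixin (NOT the real API):
    # save_pretrained persists the init config to disk, and from_pretrained
    # rebuilds the model from that file. The real mixin additionally handles
    # model weights and Hub uploads/downloads.
    def save_pretrained(self, save_dir):
        os.makedirs(save_dir, exist_ok=True)
        with open(os.path.join(save_dir, "config.json"), "w") as f:
            json.dump(self.config, f)

    @classmethod
    def from_pretrained(cls, save_dir):
        with open(os.path.join(save_dir, "config.json")) as f:
            return cls(**json.load(f))

class ToyModel(ToyHubMixin):
    def __init__(self, **config):
        self.config = config

with tempfile.TemporaryDirectory() as tmp:
    ToyModel(n_layer=12, n_embd=768).save_pretrained(tmp)
    restored = ToyModel.from_pretrained(tmp)
```

The point of the pattern is that any `nn.Module` subclass that also inherits the mixin gains `save_pretrained`/`from_pretrained` without extra wiring.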
Training
SimpleStories-125M was trained using distillation techniques to balance model size and performance. The training process is managed through the simple_stories_train repository, which houses the scripts and configurations needed to replicate the training environment.
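Distillation, broadly, trains a small student model to match a larger teacher's output distribution rather than only hard labels. A minimal, stdlib-only sketch of the soft-target loss; this is illustrative, and the actual objective used in simple_stories_train may differ:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences over non-top tokens.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions:
    # zero when the student matches the teacher exactly, positive otherwise.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits give zero loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```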
Guide: Running Locally
To run SimpleStories-125M locally, follow these steps:
- Install Dependencies: Ensure you have Python and PyTorch installed. Use pip to install the Hugging Face Hub library:

```shell
pip install huggingface-hub torch
```
- Load the Model: Use the following Python script to load SimpleStories-125M:

```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

from simple_stories_train.models.llama import Llama, LlamaConfig
from simple_stories_train.models.model_configs import MODEL_CONFIGS_DICT

class LlamaTransformer(nn.Module, PyTorchModelHubMixin):
    def __init__(self, **config):
        super().__init__()
        self.llama = Llama(LlamaConfig(**config))

    def forward(self, x):
        return self.llama(x)

# "d12" selects the configuration matching this checkpoint.
config = MODEL_CONFIGS_DICT["d12"]
model = LlamaTransformer(**config)
# from_pretrained is a classmethod provided by the mixin; it returns a new
# instance with weights downloaded from the Hugging Face Hub.
model = LlamaTransformer.from_pretrained("lennart-finke/SimpleStories-125M")
```
- Cloud GPU Recommendation: Consider using cloud services such as AWS, Google Cloud, or Azure for access to powerful GPUs, which can enhance inference performance.
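Once the model is loaded, generation is a loop of repeated forward passes. Below is a stdlib-only sketch of greedy decoding, assuming a function that maps a token-id prefix to next-token logits; the tokenizer and the model's exact output format are not specified by this card, so `toy_logits` is a hypothetical stand-in:

```python
def greedy_decode(next_logits, prompt_ids, max_new_tokens, eos_id=None):
    # next_logits: callable taking a list of token ids and returning a list
    # of floats (one logit per vocabulary entry) for the next token.
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(ids)
        # Greedy decoding: pick the argmax token at every step.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        ids.append(next_id)
    return ids

# Toy stand-in over a 4-token vocabulary: always prefers (last_token + 1) % 4.
def toy_logits(ids):
    preferred = (ids[-1] + 1) % 4
    return [1.0 if i == preferred else 0.0 for i in range(4)]

print(greedy_decode(toy_logits, [0], max_new_tokens=3))  # → [0, 1, 2, 3]
```

With the real model, `next_logits` would wrap a forward pass over the tokenized prompt; sampling with a temperature would replace the argmax step.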
License
The model and its associated code are available under licenses specified in the simple_stories_train repository. Ensure compliance with these licenses when using the model.