OmniGen v1
BAAI

Introduction
OmniGen is a unified image generation model designed to generate a wide range of images from multi-modal prompts. It aims to simplify the image generation process by eliminating the need for additional network modules and preprocessing steps. OmniGen allows for easy customization and fine-tuning, facilitating diverse and creative image-generation tasks.
Architecture
OmniGen operates without requiring additional plugins or operations, automatically identifying features in input images based on text prompts. It supports various tasks, including text-to-image generation, subject-driven generation, identity-preserving generation, image editing, and image-conditioned generation.
Training
OmniGen can be fine-tuned using the provided training script train.py, which supports techniques such as LoRA (Low-Rank Adaptation). Users can adjust parameters like learning rate, batch size, and dropout probability to optimize the model for specific tasks. Detailed instructions for fine-tuning are available in the documentation.
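A LoRA fine-tuning invocation might look like the sketch below. Note that the flag names here are illustrative assumptions, not the script's documented interface; run the script's help output to see the options train.py actually exposes.

```shell
# Hypothetical flags for illustration only -- consult `python train.py --help`
# and the repository documentation for the real option names.
python train.py \
    --model_name_or_path Shitao/OmniGen-v1 \
    --use_lora \
    --lora_rank 8 \
    --lr 1e-4 \
    --batch_size_per_device 2 \
    --dropout_prob 0.1 \
    --results_dir ./results
```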
Guide: Running Locally
Basic Steps
1. Create a Python environment (Conda recommended):

   ```bash
   conda create -n omnigen python=3.10.12
   conda activate omnigen
   ```

2. Clone the repository and install dependencies:

   ```bash
   git clone https://github.com/staoxiao/OmniGen.git
   cd OmniGen
   pip install -e .
   ```

3. Install the appropriate version of PyTorch for your CUDA version, for example:

   ```bash
   pip install torch==2.3.1+cu118 torchvision --extra-index-url https://download.pytorch.org/whl/cu118
   ```

4. Run an example. Import and use the OmniGen pipeline:

   ```python
   from OmniGen import OmniGenPipeline

   pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")
   images = pipe(
       prompt="A curly-haired man in a red shirt is drinking tea.",
       height=1024,
       width=1024,
       guidance_scale=2.5,
       seed=0,
   )
   images[0].save("example_t2i.png")
   ```

5. Use cloud GPUs: for resource-intensive tasks, consider using cloud GPUs on platforms like Google Colab.
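For multi-modal tasks such as image editing or subject-driven generation, OmniGen references input images with placeholder tags embedded in the text prompt. The tag format used below (`<img><|image_1|></img>`) follows the project's published examples, but verify it against the current repository; the helper function itself is only a sketch for composing such prompts.

```python
def build_prompt(template: str, num_images: int) -> str:
    """Fill {img1}, {img2}, ... slots in a prompt template with
    OmniGen-style image placeholder tags (format assumed from the
    project's examples)."""
    tags = {
        f"img{i}": f"<img><|image_{i}|></img>"
        for i in range(1, num_images + 1)
    }
    return template.format(**tags)


prompt = build_prompt(
    "The woman in {img1} waves her hand happily in the crowd.",
    num_images=1,
)
print(prompt)
# The woman in <img><|image_1|></img> waves her hand happily in the crowd.
```

The composed prompt is then passed to the pipeline alongside the image files themselves (the project's examples use an `input_images` list whose order matches the numbered placeholders).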
Additional Resources
- For more examples and detailed instructions, refer to inference.ipynb and inference_demo.ipynb.
- For efficient resource management, consult docs/inference.md.
License
This repository is licensed under the MIT License.