glide base

fusing

Introduction

GLIDE is a text-guided diffusion model for photorealistic image generation and editing. It was introduced to study how well diffusion models handle text-conditional image synthesis, and it compares two guidance strategies: CLIP guidance and classifier-free guidance. Human evaluators preferred classifier-free guidance for both photorealism and caption similarity. In those evaluations, samples produced with classifier-free guidance were also preferred over samples from DALL-E.

Architecture

GLIDE is built on a text-conditional diffusion model; the model described in the GLIDE paper has 3.5 billion parameters. The architecture supports classifier-free guidance, which improves sample quality by trading some diversity for fidelity to the caption. The model can also be fine-tuned for image inpainting, which allows selected image regions to be edited from a text prompt.
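
The guidance step mentioned above can be sketched as follows. This is a minimal, dependency-free illustration of the classifier-free guidance rule, not the library's implementation: the function name `guided_noise` and the list-of-floats representation are assumptions made for the example, and a real pipeline would apply the same arithmetic to tensors.

```python
from typing import List, Sequence


def guided_noise(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend unconditional and text-conditional noise predictions.

    Computes eps_uncond + s * (eps_cond - eps_uncond).
    s = 1 returns the plain conditional prediction; s > 1 pushes the
    sample further toward the caption, trading diversity for fidelity.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(eps_uncond, eps_cond)
    ]


# Toy values standing in for the model's two forward passes
# (one with an empty caption, one with the real caption).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]
print(guided_noise(eps_uncond, eps_cond, guidance_scale=3.0))
# prints [3.0, 1.0, 2.0]
```

Where the two predictions agree (the second element), guidance leaves the value unchanged; where they differ, the difference is amplified by the scale.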

Training

The model is trained as a diffusion model on large-scale datasets, with guidance strategies used to improve photorealism and agreement between the image and its caption. After this training, it can be fine-tuned for specific tasks such as inpainting.
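
As a rough illustration of what a single training example can look like in this kind of setup, the sketch below noises a clean sample and randomly blanks the caption, so that one network can learn both conditional and unconditional predictions. It assumes the standard noise-prediction formulation of diffusion training. The helper names, the list-of-floats "image", and the default drop probability of 0.2 are illustrative assumptions, not details taken from this model card.

```python
import math
import random
from typing import List, Sequence, Tuple


def add_noise(x0: Sequence[float], noise: Sequence[float], alpha_bar: float) -> List[float]:
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * n for x, n in zip(x0, noise)]


def maybe_drop_caption(caption: str, drop_prob: float, rng: random.Random) -> str:
    """Replace the caption with an empty string with probability drop_prob."""
    return "" if rng.random() < drop_prob else caption


def make_training_example(
    x0: Sequence[float],
    caption: str,
    alpha_bar: float,
    rng: random.Random,
    drop_prob: float = 0.2,  # assumed value, for illustration only
) -> Tuple[List[float], str, List[float]]:
    """Return (noisy input, conditioning text, noise target)."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = add_noise(x0, noise, alpha_bar)
    text = maybe_drop_caption(caption, drop_prob, rng)
    return x_t, text, noise


rng = random.Random(0)
x_t, text, target = make_training_example([0.5, -0.5], "a corgi", 0.9, rng)
# A network would be trained to predict `target` from (x_t, text).
```

Examples with an empty caption teach the unconditional prediction that classifier-free guidance needs at sampling time.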

Guide: Running Locally

To run GLIDE locally, follow these steps:

  1. Install Required Libraries:
    pip install diffusers torch Pillow
    
  2. Load and Use the Model:
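    import torch
    from diffusers import DiffusionPipeline
    import PIL.Image
    
    # Load the pretrained GLIDE pipeline (model weights and scheduler).
    model_id = "fusing/glide-base"
    pipeline = DiffusionPipeline.from_pretrained(model_id)
    
    # Run text-conditioned generation. This snippet assumes the pipeline
    # returns a tensor with a leading batch dimension and values in [-1, 1].
    img = pipeline("a crayon drawing of a corgi")
    
    # Drop the batch dimension, rescale to [0, 255], and convert to uint8.
    img = img.squeeze(0)
    img = ((img + 1) * 127.5).round().clamp(0, 255).to(torch.uint8).cpu().numpy()
    
    # Convert to a PIL image and save it as a PNG.
    image_pil = PIL.Image.fromarray(img)
    image_pil.save("test.png")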
    
  3. Cloud GPUs: Generation is computationally demanding. For better performance, consider a cloud GPU service such as AWS EC2, Google Cloud Platform, or Azure.
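
The post-processing line in step 2 maps model outputs from the range [-1, 1] to 8-bit pixel values. The sketch below reproduces that arithmetic on plain Python floats so the mapping is easy to check by hand. `to_uint8` is a hypothetical helper name, and the example assumes, as the step 2 code implies, that the pipeline returns values nominally in [-1, 1].

```python
def to_uint8(value: float) -> int:
    """Map a value in [-1, 1] to an integer pixel in [0, 255].

    Mirrors ((x + 1) * 127.5).round().clamp(0, 255) from step 2.
    Python's round() rounds halves to the nearest even integer,
    which matches the behavior of torch.round.
    """
    scaled = round((value + 1.0) * 127.5)
    # Clamp so out-of-range model outputs cannot overflow a uint8.
    return max(0, min(255, scaled))


print([to_uint8(v) for v in (-1.0, 0.0, 1.0, 1.5)])
# prints [0, 128, 255, 255]
```

The clamp matters because the model output is not guaranteed to stay inside [-1, 1]; without it, values slightly outside that range would wrap around when cast to an unsigned 8-bit integer.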

License

GLIDE is licensed under the Apache-2.0 License, which permits use, modification, and distribution subject to the terms of the license.
