SD3.5 Large Photorealistic LoRA

prithivMLmods

Introduction

SD3.5-Large-Photorealistic-LoRA, developed by prithivMLmods, is a LoRA (Low-Rank Adaptation) adapter for text-to-image generation, tuned for photorealistic images. It is applied on top of the stable-diffusion-3.5-large base model rather than used on its own.

Architecture

Images are generated with the StableDiffusion3Pipeline class from the diffusers library. The LoRA weights are loaded on top of the base model to steer it toward photorealistic output. The training configuration uses a constant learning rate scheduler, the AdamW optimizer, and specific settings for noise and network dimensions.
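As a rough illustration of what adapting a base weight with LoRA means, the sketch below merges a low-rank update into a frozen weight matrix, W' = W + scale * (B A), using plain Python lists. The function name `merge_lora_delta`, the toy sizes, and the rank are assumptions made for this example; they are not taken from the model or from the diffusers implementation. The `scale` argument plays the same role as the `lora_scale` value passed to `fuse_lora` in the guide.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora_delta(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) as a new matrix.

    Hypothetical helper for illustration only.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy sizes (assumed): a 4x4 frozen weight adapted with rank-1 factors.
out_features, in_features, rank = 4, 4, 1
rng = random.Random(0)
weight = [[rng.gauss(0, 1) for _ in range(in_features)] for _ in range(out_features)]
lora_down = [[rng.gauss(0, 1) for _ in range(in_features)] for _ in range(rank)]  # rank x in
lora_up = [[0.0] * rank for _ in range(out_features)]  # out x rank, zero-initialised

# With a zero-initialised up matrix the merged weight equals the original.
fused = merge_lora_delta(weight, lora_down, lora_up, scale=1.0)

# The adapter stores far fewer values than the full matrix as sizes grow.
full_params = out_features * in_features
lora_params = rank * (in_features + out_features)
print(full_params, lora_params)
```

Only the two small factor matrices are trained, which is why a LoRA file is much smaller than the base model it modifies.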

Training

The model is still in the training phase and has been trained on 40 images. Training uses a constant learning rate scheduler, a noise offset of 0.03, and the AdamW optimizer. Multires Noise Discount is set to 0.1 and Multires Noise Iterations to 10.
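For reference, the stated training settings can be collected in one place. The sketch below is only a convenient record of the values listed above; the class name and field names are hypothetical and do not correspond to any particular trainer's configuration format.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingSettings:
    """Training values as stated in the text; field names are illustrative."""

    num_training_images: int = 40
    lr_scheduler: str = "constant"
    optimizer: str = "AdamW"
    noise_offset: float = 0.03
    multires_noise_discount: float = 0.1
    multires_noise_iterations: int = 10


settings = TrainingSettings()
for name, value in asdict(settings).items():
    print(f"{name}: {value}")
```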

Guide: Running Locally

To run the model locally, follow these steps:

  1. Environment Setup:

    • Ensure you have PyTorch installed with CUDA support for GPU acceleration.
    • Install the diffusers library from Hugging Face.
  2. Load the Model:

    import torch
    from diffusers import StableDiffusion3Pipeline
    
    # Load the base model in bfloat16 to reduce memory use.
    pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
    # Apply the LoRA weights and fuse them into the base weights at full strength.
    pipe.load_lora_weights("prithivMLmods/SD3.5-Large-Photorealistic-LoRA", weight_name="Photorealistic-SD3.5-Large-LoRA.safetensors")
    pipe.fuse_lora(lora_scale=1.0)
    # Move the pipeline to the GPU.
    pipe.to("cuda")
    
  3. Generate an Image:

    prompt = "Man in the style of dark beige and brown, uhd image, youthful protagonists, nonrepresentational photography"
    negative_prompt = "(lowres, low quality, worst quality)"
    
    # Generate one 960x1280 image and save it to disk.
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=24,
        guidance_scale=4.0,
        width=960,
        height=1280,
    ).images[0]
    image.save("example.jpg")
    
  4. Use Cloud GPUs:

    • If no suitable local GPU is available, consider a cloud GPU service such as AWS, Google Cloud, or Azure.
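Before working through the steps above, it can help to confirm that the required packages are importable and that a CUDA device is visible. The following is a minimal sketch under that assumption; the helper name `missing_packages` is hypothetical and the check covers only the two packages named in the environment setup step.

```python
import importlib.util


def missing_packages(required=("torch", "diffusers")):
    """Return the names in `required` that cannot be found in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    import torch

    # The loading code in the guide moves the pipeline to "cuda",
    # so a visible CUDA device is needed to run it unchanged.
    print("CUDA available:", torch.cuda.is_available())
```

If CUDA is not available locally, the same check can be run on a cloud instance before downloading the model weights.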

License

The model is licensed under creativeml-openrail-m, which permits use and redistribution subject to the use-based restrictions set out in the license.
