Last Updated | Version | Changes |
---|---|---|
7/7/2023 | 1.0 | First version! This document will be kept up to date with SDXL developments in the run-up to release. |
7/9/2023 | 1.2 | Stability have released 0.9 to all (with restrictions)! |
7/10/2023 | 1.3 | Official SDXL ComfyUI Workflow Released |
7/11/2023 | 1.4 | Models on HuggingFace are back behind the application! |
7/12/2023 | 1.5 | Xformers vs SDP-cross-attention for ComfyUI |
<aside> 🚫 7/11/2023 - Downloads on HuggingFace again require an application and approval, but all applications are accepted! What a wild ride.
It appears that SDXL 0.9 has been released by Stability for all who wish to download it! This happened quietly, and we haven’t seen any official announcement.
~~SDXL 0.9 (SDXL 1.0) is due for general release on 7/18; however, a number of testers and community creators have early access to the model. The model has also leaked to torrent sites and can be found in the wild.~~ Civitai has uploaded the model, but will not be opening access until authorized to do so by Stability. Additionally, there are malicious files floating around masquerading as SDXL.
SDXL will only be released in .safetensors format. Never download or run a suspicious .ckpt file!
</aside>
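As a rough first-pass sanity check on a download, the sketch below tests whether a file at least looks like a .safetensors file. It relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function name `looks_like_safetensors` and the size cap are made up for this example. Passing the check does not prove a file is genuine or safe; it only catches the obvious case of a different file type wearing the wrong extension.

```python
import json
import struct
from pathlib import Path


def looks_like_safetensors(path, max_header_bytes=100_000_000):
    """Return True if the file has a .safetensors extension and a parseable header.

    Hypothetical helper for illustration; the size cap is an arbitrary assumption.
    """
    path = Path(path)
    if path.suffix.lower() != ".safetensors":
        return False
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            return False
        # The first 8 bytes hold the JSON header length as an unsigned 64-bit LE integer.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len == 0 or header_len > max_header_bytes:
            return False
        raw = f.read(header_len)
        if len(raw) != header_len:
            return False
    try:
        header = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    # A real header is a JSON object mapping tensor names to dtype/shape/offsets.
    return isinstance(header, dict)
```

Typical use would be something like `looks_like_safetensors("sd_xl_base_0.9.safetensors")` before loading the file in your UI of choice (the filename here is only an example).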
<aside> 🚨 SDXL 0.9 is intended exclusively for research purposes, and has a non-commercial, research-only license. The release candidate model, SDXL 1.0, is expected to be somewhat more advanced, and will presumably have a more permissive license.
</aside>
<aside> ❓ **Per Stability,** what’s released on the 18th might not be the full 1.0 model, and might be something between 0.9 and 1.0. The information from Stability changes daily, so I’ll try to keep this guide up to date in the run-up to launch!
</aside>
SDXL 0.9 is a groundbreaking new text-to-image model, and a stepping stone to SDXL 1.0, which will be released on or around July 18th. Technologically, it’s a leap forward from SD 1.5 or 2.x, boasting a parameter count (the total number of weights and biases in the model’s neural network) of 3.5 billion for the base model and 6.6 billion for the full ensemble pipeline of base model plus second-stage refiner. In contrast, SD 1.4 has just ~890 million parameters.
Further information about the inner workings of SDXL can be found in Stability AI’s SDXL research paper:
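To make "parameter count" concrete, here is a toy sketch that counts the weights and biases of a tiny fully connected network. The layer sizes are invented for illustration and have nothing to do with SDXL's actual architecture; the point is only that every layer contributes one weight per input/output pair plus one bias per output.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a stack of fully connected layers.

    layer_sizes lists the width of each layer, input first.
    Hypothetical helper for illustration only.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output connection
        total += n_out         # one bias per output unit
    return total


# A toy network: 4 inputs -> 8 hidden units -> 2 outputs.
toy_params = count_dense_parameters([4, 8, 2])
print(toy_params)  # 58
```

Real diffusion models are built from convolution and attention layers rather than plain dense layers, but the count works the same way: add up every learned number in the network.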
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Aside from having ~3x more parameters than previous SD models, SDXL uses two CLIP models, including the largest OpenCLIP model trained to date (OpenCLIP ViT-G/14), and has a far higher native resolution of 1024x1024, in contrast to the 512x512 of SD 1.4/1.5, allowing for greatly improved fidelity and depth.
Unlike previous SD models, SDXL uses a two-stage image creation process. The base model generates the initial latent image (txt2img), then passes that output and the same prompt through a refiner model (essentially an img2img workflow), upscaling and adding fine detail to the generated output.
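For a sense of what the jump in native resolution means in practice, the arithmetic can be sketched as below. The 8x VAE downscale factor and the 4 latent channels are not stated in this guide; they are the commonly documented values for Stable Diffusion's autoencoder, so treat them as assumptions here.

```python
VAE_DOWNSCALE = 8    # assumption: the SD autoencoder shrinks each side by 8x
LATENT_CHANNELS = 4  # assumption: the latent image has 4 channels


def latent_shape(width, height):
    """Return (channels, latent_height, latent_width) for a given pixel size."""
    return (LATENT_CHANNELS, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE)


sd15_latent = latent_shape(512, 512)
sdxl_latent = latent_shape(1024, 1024)
pixel_ratio = (1024 * 1024) / (512 * 512)

print(sd15_latent)  # (4, 64, 64)
print(sdxl_latent)  # (4, 128, 128)
print(pixel_ratio)  # 4.0
```

Doubling each side means four times as many pixels (and latent values) per image, which is part of why SDXL is noticeably heavier to run than SD 1.x.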
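The two-stage flow can be sketched with Hugging Face's `diffusers` library roughly as follows. This is a minimal sketch, not the official workflow: it assumes a recent `diffusers` and `torch` install, a CUDA GPU with enough VRAM, and approved access to the gated 0.9 repositories. The wrapper function `generate_two_stage` is a name made up for this example, and the model IDs should be double-checked against the HuggingFace pages.

```python
# Assumed repository IDs for the gated 0.9 research release.
BASE_MODEL_ID = "stabilityai/stable-diffusion-xl-base-0.9"
REFINER_MODEL_ID = "stabilityai/stable-diffusion-xl-refiner-0.9"


def generate_two_stage(prompt, device="cuda"):
    """Run base (txt2img) then refiner (img2img) and return a PIL image.

    Hypothetical wrapper for illustration; imports are kept inside the
    function so the file can be loaded without the heavy dependencies.
    """
    import torch
    from diffusers import (
        StableDiffusionXLImg2ImgPipeline,
        StableDiffusionXLPipeline,
    )

    base = StableDiffusionXLPipeline.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16, use_safetensors=True
    ).to(device)
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        REFINER_MODEL_ID, torch_dtype=torch.float16, use_safetensors=True
    ).to(device)

    # Stage 1: the base model turns the prompt into a latent image.
    latents = base(prompt=prompt, output_type="latent").images

    # Stage 2: the refiner takes the latent plus the same prompt and adds detail.
    return refiner(prompt=prompt, image=latents).images[0]
```

Calling `generate_two_stage("a photo of an astronaut riding a horse").save("out.png")` would download both models on first run, so expect a long wait and a lot of disk usage.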
Both the SDXL Base model and Refiner will be released on or around July 18th.