Stable Diffusion General FAQ


Welcome

This FAQ collects quick answers to commonly asked questions. It is the successor to the original trashfaq, which is no longer updated. Relevant topics have been copied and updated into this version and the sdgguide rentry. If you must use older SD1.5 based models because of hardware limitations, the trashfaq still contains useful information for you.

A detailed general guide exists here: https://rentry.org/sdgguide (WIP)

  1. It's 2025, I came back after a year, what do?
  2. What models are currently popular?
  3. What do VPRED and EPS mean?
  4. My gens are fried, help!
  5. Why are there always additional characters in the background even though I specify (solo)?
  6. How can I prevent inpaint halos?
  7. What are the brackets [], parentheses () and other {}+ stuff for?
  8. How do I use the BREAK statement? Can I define separate characters with it?
  9. What CLIP SKIP do I need to use?
  10. Which VAE do I need?
  11. Sometimes I get just black or corrupted images when using (re)Forge
  12. What is the best upscaler for hiresfix or img2img?
  13. How can I gen ...
    1. ... gryphons
    2. ... coherent text with text2img
    3. ... a character model sheet with different views?
  14. How can I add text to a t-shirt so it flows with the fabric?
  15. Where is the current trashcollects_lora?
  16. What is a Catbox?
  17. How can I add (hidden) options to my UI? Quicksettings (Forge/reForge)
  18. Where can I get current Embeddings / Textual Inversions?
  19. Local video gens? what?
  20. I get strange localized artifacts in my gens
  21. Where do I get a current danbooru & e621 combined tag autocomplete csv?
  22. How to use the "send to workflow" feature for ComfyUI?
  23. Can I extract tags from images that have no metadata?
  24. Why are my snow leopard gens always in winter?
  25. Why do a lot of carrots appear in my gen when I use the carrot \(artist\) tag?
  26. When I interrupt generation on (re)Forge it takes a long time to finally stop, why?
  27. How can I use the same model, LoRA, etc. directories for (re)Forge, ComfyUI, InvokeAI?
  28. What resolution/ratio should I use with SDXL based models?
  29. Why does img2img/inpaint in (re)Forge use far fewer steps than I specify?


It's 2025, I came back after a year, what do?

If you are still using the original AUTOMATIC1111 WebUI, it's recommended to switch to (re)Forge or ComfyUI, as A1111 is pretty outdated and rarely updated.

General advice (especially if you were using SD1.5)

  • Current models don't require massive amounts of (negative) tags to work; keep your prompts lean
  • Textual Inversions/Embeddings/Hypernetworks are not commonly used anymore, and old SD1.5 based ones won't work with SDXL anyway
  • SD1.5 LoRAs don't work with SDXL either and are hidden by the WebUI. Pony (PDXL) based LoRAs may or may not work with Illustrious/noobAI based models, but many of those are being retrained, see here
  • PDXL score tags don't work on IllustriousXL/noobaiXL based models, you must/can use different quality tags, see here
  • These models support artist tags (again), see here
  • All SDXL based models need a higher basegen resolution than SD1.5, see here

While Pony Diffusion XL (PDXL) is still popular outside of sdg, the meta is currently shifting to IllustriousXL or rather noobaiXL based models, as these provide better body coherence, updated knowledge of popular characters/series/concepts, and support the uncensored use of artist tags for unique style mixes.

A list of currently popular sdg models including download links, recommendations and sampler comparisons can be found here:

An introductory guide on how to prompt and use noobaiXL based models can be found here:

What do VPRED and EPS mean?

Most PDXL based models were/are of the EPS type, which works well in pretty much any circumstance. As the noobaiXL model also exists in a VPRED (V_PREDICTION) version (besides the EPS version), it has recently become more popular to create VPRED merge models from it. VPRED models are technically slightly different from EPS models and must be supported by the UI you are using. The good news is that popular UIs like (re)Forge and ComfyUI support VPRED out of the box without manual adjustments (see RescaleCFG), as long as you update them regularly.

Current noobaiXL based VPRED models do not need a yaml file to work.

If you want to use an older SD1.5 based VPRED model, the civitai page will usually mention the additional "config file" that you can download if you expand the files section on the right-hand side. Place it in the same folder in which you saved the safetensors model file!
yaml-1.png

My gens are fried, help!

  • fried1.jpg: Image fries at the last generation step. Wrong VAE, see here.
  • fried2.jpg: Image randomly turns into noise, see here.
  • fried3.jpg: The model you are using is a VPRED model, but either you are missing the corresponding yaml file or your UI cannot handle VPRED models properly. Update your UI, or check if you can enable support manually.

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

Why are there always additional characters in the background even though I specify (solo)?

These are lovingly called "noob gremlins" and are an issue with most of the current models. Tags alone cannot always get rid of them. Either you edit them out with an image editor, or you can try the following suggestions:

  • add weight (solo:1.3)
  • for custom chars, specify "anthro <species>", e.g. "anthro fox" instead of just fox
  • add "feral <species>" in the neg prompt
  • empty surroundings in your gens increase the risk of gremlins appearing; try tagging some objects that fit your scene instead

How can I prevent inpaint halos?

noobaiXL is inherently prone to producing inpaint halos. Here are some tips:

  • Use a non-ancestral sampler just for inpainting
  • Layer the inpainted image onto the original in an image editor and cut or smudge the halo away (example needed)
  • Increase gamma before inpainting dark images:

One thing you can do is put the first upscale pass into a photo editor and increase the gamma by, say, 1.4, then once you're done hit it with the reciprocal 0.7143 afterwards, as the halos won't appear as often on the gamma-adjusted image. Furthermore, after you erase the halos back down to the base upscale, you can go back to the photo editor and raise the black level by 1-3 to eliminate halos that are just on the cusp of visibility on black and might be overlapping with the subject's body or clothing.
Yet another thing you can do is open up the soft inpainting tab and move the difference contrast from 2 down to 1.5 or even 1. This may result in gens doing weird overlay-layer effects, but it's one of the largest differences I've had with the soft inpainting setting enabled.
One thing I forgot to mention: adjusting the gamma might cause your subsequent inpaints to darken the image again, so returning it to base gamma would end up with the subjects becoming darker than in the base gen.
Still, I've found that 1.3-1.4 is a good value to start with, and I haven't really come across too many problems involving turning subjects darker because of it.
And when erasing down to the base layer, use a big brush with 0% hardness. There will still be a bit of a glow from the halo, but the hard edge will be removed this way, making it much less noticeable.
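
As a small sketch of the gamma round-trip described above (using Pillow; the 1.4 value and the file names are just example assumptions):

```python
from PIL import Image

def adjust_gamma(img, gamma):
    """Apply a gamma curve via a per-channel lookup table (gamma > 1 brightens)."""
    lut = [round(255 * (i / 255) ** (1.0 / gamma)) for i in range(256)]
    return img.convert("RGB").point(lut * 3)

# Brighten the first upscale pass before inpainting dark areas ...
bright = adjust_gamma(Image.open("base_upscale.png"), 1.4)
bright.save("for_inpainting.png")

# ... then hit the finished inpaint with the reciprocal (1 / 1.4 ≈ 0.7143).
restored = adjust_gamma(Image.open("inpainted.png"), 1.0 / 1.4)
restored.save("final.png")
```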

What are the brackets [], parentheses () and other {}+ stuff for?

These are used by the (re)Forge advanced prompt syntax that is inherited from A1111, see here:

{} is only used by NovelAI and is irrelevant for local generation; + and - are used by InvokeAI for weighting.

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

How do I use the BREAK statement? Can I define separate characters with it?

No!

SDXL models "consume" your tags in blocks of ~75 tokens at a time. If your prompt contains more tags/tokens, they get split into different blocks automatically. The BREAK statement does force the creation of a new block, filling the current one with empty tokens. But roughly speaking, the model does use all tokens from all blocks equally during the generation process, so forcing a new block manually cannot be used to reliably group your tags together to e.g. fully define a custom character. In current models the BREAK statement has very limited edge-case uses and it's recommended to not use it, with one exception:

The BREAK statement is used by the Regional Prompter extension to separate the regions, so if you find an example prompt that contains BREAK it might have been created with the help of that extension.
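
For intuition, here is a rough conceptual sketch of the chunking behaviour (this is not the real CLIP tokenizer, token counts are simplified, and the function is purely illustrative):

```python
CHUNK = 75  # tokens per block (UIs also add begin/end tokens, omitted here)

def split_into_blocks(tokens, break_positions=()):
    """Illustrative only: tokens fill 75-token blocks; a BREAK pads the
    current block with empty tokens and starts a new one."""
    blocks, current = [], []
    for i, tok in enumerate(tokens):
        if i in break_positions:                      # a BREAK sits before this token
            current += [""] * (CHUNK - len(current))  # pad the unfinished block
            blocks.append(current)
            current = []
        current.append(tok)
        if len(current) == CHUNK:                     # block is full, start the next
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

# 100 tokens without BREAK -> blocks of 75 and 25 tokens;
# with a BREAK after the 10th token -> a padded 75-token block, then 75 and 15.
print([len(b) for b in split_into_blocks(list(range(100)))])                        # [75, 25]
print([len(b) for b in split_into_blocks(list(range(100)), break_positions={10})])  # [75, 75, 15]
```

Either way, every block is fed to the model, which is why BREAK does not isolate a character definition.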

What CLIP SKIP do I need to use?

With up-to-date UIs you don't need to set clip skip manually. Tests have shown that current UIs set it automatically for SDXL based models, so setting it manually may either be ignored by the UI or lead to corrupt images.

ComfyUI sets clip skip to -2 by default for all SDXL models
https://github.com/comfyanonymous/ComfyUI/blob/b4d3652d88927a341f22a35252471562f1f25f1b/comfy/sdxl_clip.py#L45

A reForge comparison shows that the clip skip UI setting is ignored when loading an SDXL model.

Which VAE do I need?

99% of current models come with a built-in VAE so there is rarely a need to download a separate one for normal operations. Leave it on "Automatic" in the webUIs and connect the VAE from the model load node in Comfy.

If for a specific reason you must use a separate VAE, you can use either the standard SDXL one:
https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors?download=true
or if you run into VRAM limits you can try this recommended one:
https://civitai.com/models/140686

Still using SD1.5 based models? This one seems to be recommended (DO NOT USE FOR SDXL!):
https://civitai.com/models/276082

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

Sometimes I get just black or corrupted images when using (re)Forge

A common cause is the "emphasis mode" setting; try setting it to "No norm":
width-height.jpg

What is the best upscaler for hiresfix or img2img?

Anything but latent, otherwise any current upscaler will be fine. Here is a comparison:
https://rentry.org/sdgguide#upscaler
While you can easily choose the upscaler for hiresfix in the UI, there is an option hidden in the settings menu for img2img as well:
width-height.jpg

How can I gen ...

... gryphons

I've found it pretty easy to go with (avian species, quadruped, feral, furry) and have it pop out a gryphon of the chosen species. (folded wings) lets one not deal with having large wing surfaces to inpaint as well.

... coherent text with text2img

You can't with the current popular models. Some text that is common in the training data, like "PLAP" or "sigh", may be produced semi-reliably, and some other things like "!" or "?" kind of work as well, but you cannot define complete sentences; the result will only be garbled text. It's best not to prompt for text or "speaking" and to add it afterwards with an image editor.

... a character model sheet with different views?

Simply using "model_sheet, turnaround, front view, side view" is enough to get a decent result. The best outcome is achieved without quality tags; in that case roughly 3 out of 5 generations are successful. Essentially, if you touch up the image in Photoshop (by aligning the key body parts to the same level), it can be used as a reference for 3D modeling.

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

How can I add text to a t-shirt so it flows with the fabric?

Photoshop

  1. Save a copy of the base gen with a cleaned shirt (gibberish etc. removed) as a separate .psd
  2. Apply a gaussian blur at strength 2 to the copy and save it
  3. Add text, logos or whatever and adjust it to fit the shirt; set the blending mode to multiply for a light-colored shirt, screen or linear dodge for a dark one
  4. Filter > Distort > Displace, values at 10-15
  5. Use the blurred separate file as the displacement map
  6. Make a copy of the blurred gen, clip it to the text, set blending to linear dodge (add) and opacity to 50%
  7. Clip a levels adjustment to the copy of the blurred gen and play with the levels until it looks decent
  8. Maybe add a layer mask to the text/logo and fade out the edges if you feel like it
  9. Done
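
If you would rather script the same idea, here is a minimal Python sketch (assuming Pillow and NumPy; file names, blur strength and displacement scale are just example values, and the text layer is assumed to be the same size as the base image):

```python
import numpy as np
from PIL import Image, ImageChops, ImageFilter

def displace(layer_rgba, disp_map, scale=12):
    """Shift each pixel of the text layer by an offset derived from the
    displacement map's luminance (128 = no shift), roughly mimicking
    Photoshop's Filter > Distort > Displace."""
    disp = np.asarray(disp_map.convert("L"), dtype=np.float32)
    shift = (disp - 128.0) / 128.0 * scale
    h, w = disp.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xx + shift, 0, w - 1).astype(int)
    src_y = np.clip(yy + shift, 0, h - 1).astype(int)
    text = np.asarray(layer_rgba.convert("RGBA"))
    return Image.fromarray(text[src_y, src_x], "RGBA")

base = Image.open("shirt_base.png").convert("RGB")    # cleaned base gen
blurred = base.filter(ImageFilter.GaussianBlur(2))    # doubles as the displacement map
text = Image.open("text_layer.png").convert("RGBA")   # transparent layer with the text/logo
warped = displace(text, blurred, scale=12)

# "Multiply" blend for a light shirt: darken the fabric where the text is opaque.
multiplied = ImageChops.multiply(base, warped.convert("RGB"))
result = Image.composite(multiplied, base, warped.split()[3])  # text alpha as the mask
result.save("shirt_with_text.png")
```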

Inkscape

I find Inkscape to be a better tool for complex text tasks, or even simple ones as I find Krita's text tool very bad.

Where is the current trashcollects_lora?

What is a Catbox?

Catbox is a free image and file sharing service
URL: https://catbox.moe/
Uploading images to 4chan or imgbox strips the generation metadata from the image; uploading them to catbox allows sharing this information with other people. The file size limit is 200 MB. If you have a larger file to share, you can use https://litterbox.catbox.moe/ to upload a file up to 2 GB in size, which is stored for up to three days (you MUST set the expiration limit BEFORE starting the upload).

Catboxanon maintains a 4chanX extension script that lets you upload images to catbox and 4chan simultaneously, as well as view the metadata of these catboxed images in the browser:
https://gist.github.com/catboxanon/ca46eb79ce55e3216aecab49d5c7a3fb
Follow the instructions on the GitHub page to install it.

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

How can I add (hidden) options to my UI? Quicksettings (Forge/reForge)

quicksettings.jpg
In the UI under "Settings" > "User Interface", you can add various settings to the top of the UI that you want quick access to, including some that are otherwise hidden. Click on the "[Info]" link to view a list of all available options.

Where can I get current Embeddings / Textual Inversions?

These are rarely used nowadays and the few that exist are of dubious value. It's recommended to just ignore them for any SDXL based model.

Local video gens? what?

There has been some progress in getting local t2v and even i2v to work, but most available video models cover realistic and human training data; only a few 2D LoRAs have been appearing. You will need at least 12 GB of VRAM, and even then you will need to use quantized models with reduced quality. The quality/duration achievable with local gen is still vastly inferior to e.g. Kling.
A full section will be added to the sdgguide once I have had the time to test it; in the meantime another anon wrote this WAN image2video guide:
https://rentry.org/wan21kjguide

I get strange localized artifacts in my gens

artifacts.png
If you see artifacts like these but the overall image is fine, it might be caused by an incompatible model & LoRA combination. Try lowering the LoRA weight; if that does not solve the issue, it's likely the two can't be used together.

Where do I get a current danbooru & e621 combined tag autocomplete csv?

https://files.catbox.moe/05a9ep.csv

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

How to use the "send to workflow" feature for ComfyUI?

Q: I have several different workflows saved for things like inpainting, upscaling, and ControlNet, but the context menu only ever shows "send to current workflow". Is there some other place to save workflows that I'm missing?

You have to edit the pysssss.json file in ComfyUI\custom_nodes\comfyui-custom-scripts and add
"workflows": { "directory": "path\to\workflows\folder" }
The default path for saved workflows is "ComfyUI\user\default\workflows".
Restart ComfyUI and it will work.
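
For reference, a minimal pysssss.json could look like the sketch below. The directory path is only an example; note that in JSON, backslashes must be doubled (\\) or you can simply use forward slashes instead. If your file already contains other settings, add the "workflows" entry alongside them.

```json
{
    "workflows": {
        "directory": "C:/SD/ComfyUI/user/default/workflows"
    }
}
```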

Can I extract tags from images that have no metadata?

Short answer: No. If the metadata has been fully removed from the image, which can happen e.g. by editing it in photoshop, there is no feasible way to get the original metadata back.

There are two more options:
If it is a PNG file, there is a chance that the original creator saved the metadata in the PNG alpha channel with an extension called StealthPNG. For (re)Forge, install the extension https://github.com/neggles/sd-webui-stealth-pnginfo to be able to read it in the UI's PNG Info tab.

Finally, you can ask the AI to describe the image for you. (re)Forge has two built-in tools hidden in the img2img tab with which you can ask CLIP or the DeepBooru model to describe the loaded image:
interrogate.jpg
CLIP will give you a natural-language description, while DeepBooru will use danbooru tags. Both tools are very limited in functionality and tailored to realistic or anime images.

If you are using ComfyUI there is a recommended custom node which will describe the image with e621 tags
https://github.com/loopyd/ComfyUI-FD-Tagger

An unverified tip from another anon:

If you want something similar for (re)Forge that uses e621-style tags:
https://mega.nz/folder/UBxDgIyL#K9NJtrWTcvEQtoTl508KiA
Download "E621 Tagger extenstion.7z" and unpack it to your (re)Forge root folder. Do not use both the E621Tagger v15 and E621Tagger v16 extensions, delete one of them!

Another option, though it never uses e621-style tags. It has a lot of models to choose from, and the output reads like typical SDXL prompting with some natural language: https://github.com/pharmapsychotic/clip-interrogator-ext

https://github.com/jhc13/taggui

Why are my snow leopard gens always in winter?

Why do a lot of carrots appear in my gen when I use the carrot \(artist\) tag?

Some artist, concept or character tags can lead to tag bleed: if the tag contains words that are themselves tags known to the model, the AI may try to fulfil your apparent wish for, e.g., more carrots in your image.

Examples:

  • snow leopard, - high chance of snow or winter
  • carrot \(artist\), - high chance of additional carrots lying around
  • worm's-eye view, - high chance of worms appearing in the image

A workaround is to de-emphasize the word inside the tag. This hints to the AI that you, e.g., don't want any actual carrots, while it can still interpret the full tag as the artist tag. It is not totally reliable and depends on many factors, but it works well most of the time.

Example solutions:

  • (snow:0) leopard,
  • (carrot:0) \(artist\),
  • (worm:0)'s-eye view,

See https://rentry.org/sdgguide#weighting-attentionemphasis for more information about tag weighting.

When I interrupt generation on (re)Forge it takes a long time to finally stop, why?

On interrupt, the image is still VAE-decoded by default, which can take some time depending on various factors. Enable this option to skip the decode step:
interrupt.png

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️

How can I use the same model, LoRA, etc. directories for (re)Forge, ComfyUI, InvokeAI?

If you have multiple instances of reForge and/or Forge installed, you can use a symlink to link all "models" directories together so that the files in these folders exist only once on your hard drive. Be careful: any change to this folder in any location will be reflected everywhere, as technically it is just one directory linked to multiple locations!

For this example we assume that you have a reForge installation in "C:\SD\stable-diffusion-webui-reForge" and want to use this as your base.

  • Step 1: rename the "models" folder in your secondary installation
    oneloc-1.png
  • Step 2: open a command prompt as administrator
    oneloc-2.png
  • Step 3: Type the following command, change paths according to your setup
    mklink /J C:\SD\stable-diffusion-webui-forge-portable\webui\models C:\SD\stable-diffusion-webui-reForge\models
    The first path is the new junction to create, the second path is the existing master directory
    oneloc-3.png
  • Step 4: You will see that the directory is again visible but has a special icon in explorer
    oneloc-4.png

If you want to add ComfyUI to that master directory as well, DO NOT use the mklink approach, as ComfyUI updates tend to remove the symlinks. Instead, go to your ComfyUI folder (inside the portable folder if you are using that) and look for a file called "extra_model_paths.yaml.example". Rename that file to just "extra_model_paths.yaml" and edit it with a text editor.
By default the first section "a111" is already active, so you only have to change the "base_path" to the directory that contains your master installation. Make sure to replace backslashes \ with forward slashes /.
oneloc-5.png
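
As a rough sketch of what the edited section ends up looking like (the exact keys and subfolders may differ between ComfyUI versions, and the base_path below simply reuses the example reForge path from above):

```yaml
a111:
    base_path: C:/SD/stable-diffusion-webui-reForge/

    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: models/Lora
    embeddings: embeddings
    upscale_models: models/ESRGAN
    controlnet: models/ControlNet
```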

For adding InvokeAI, you can specify the "load in place" option when adding models to prevent the UI from copying them to its own directory.

What resolution/ratio should I use with SDXL based models?

https://rentry.org/sdgguide#resolution-width-height

Why does img2img/inpaint in (re)Forge use far fewer steps than I specify?

By default, the number of steps actually performed is the step count you configure in the GUI multiplied by the denoise value. So if you set 20 steps and the denoise is 0.3, only 6 steps will be run.

You can change this behaviour with the following setting:
i2isteps-1.png

⬆️⬆️⬆️ To the top ⬆️⬆️⬆️
Xeno443

Pub: 21 Mar 2025 14:31 UTC
Edit: 30 Mar 2025 13:15 UTC