TRUE GRID - BREAK check
Welcome
One of the great mysteries of SDXL-based image generation is whether using the BREAK statement (or concatenated conditionings) helps to clearly separate characters in multi-character gens. So here are 64 same-seed, same-prompt generations with and without BREAK.
Introduction
To find out whether the BREAK statement makes any tangible difference when generating two custom characters, I used a medium-sized prompt and ran it 64 times without BREAK, then again on the same seeds with two BREAK statements. Changing the prompt in any way always induces some change, so the images will never be 100% identical. The task is to see whether using BREAK lessens tag bleed on average.
The BREAK statement
The CLIP text encoders used by SDXL models "consume" your tags in blocks of ~75 tokens at a time. If your prompt contains more tokens than that, they are split into additional blocks automatically. The BREAK statement forces the creation of a new block, filling the rest of the current one with padding tokens. A related myth is that "the first block is more important than the next one"; in reality, block order has a minimal effect on your average gen.
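The block behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_tokenize` is a hypothetical stand-in that counts one token per word or comma (the real CLIP tokenizer uses byte-pair encoding and will give different counts), and `chunk_prompt` is not the code of any particular UI.

```python
import re

CHUNK_SIZE = 75   # assumed usable tokens per block
PAD = "<pad>"     # placeholder for whatever padding token the UI inserts


def toy_tokenize(text):
    """Hypothetical tokenizer: one token per word or comma (NOT real CLIP BPE)."""
    return re.findall(r"[^\s,]+|,", text)


def chunk_prompt(prompt, chunk_size=CHUNK_SIZE):
    """Split a prompt into fixed-size blocks; BREAK closes the current block early."""
    chunks = []
    current = []
    for segment in re.split(r"\bBREAK\b", prompt):
        for token in toy_tokenize(segment):
            if len(current) == chunk_size:
                # Block is full: start a new one automatically.
                chunks.append(current)
                current = []
            current.append(token)
        if current:
            # End of segment (BREAK or end of prompt): pad the block to full size.
            chunks.append(current + [PAD] * (chunk_size - len(current)))
            current = []
    return chunks


def real_token_counts(chunks):
    """Number of non-padding tokens in each block."""
    return [sum(1 for t in chunk if t != PAD) for chunk in chunks]


if __name__ == "__main__":
    print(real_token_counts(chunk_prompt("grass, hug BREAK horse, jeans")))
    print(real_token_counts(chunk_prompt("grass, hug horse, jeans")))
```

With BREAK, the short prompt occupies two mostly padded blocks; without it, the same tags share one block. That padding is the "token bloat" cost of using BREAK.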
Setup
Character 1:
Female spotted hyena with blue eyes and long black hair, wearing a blue crop top, brown miniskirt and earrings
Character 2:
Male horse with green eyes, brown hair, brown body fur and a scar across one eye, wearing a white shirt, jeans, a red bandana and fingerless gloves, with sunglasses on the head
The tags for each character were grouped together, with one additional block for the common tags describing the scene. For the BREAK runs, two BREAK statements were added: one after the block of common tags and one after the block for character 1.
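As a rough sketch of how the two prompt variants relate, the snippet below assembles both from the same three tag groups. The helper name `build_prompt` is hypothetical, and writing the separator as the plain keyword `BREAK` on its own line is an assumption about how the UI expects it; the tag text is copied from the prompt listed further down.

```python
COMMON = (
    "masterpiece, best quality, newest, (furry, anthro,) furgonomics, "
    "duo, male/female, hug, full-length portrait, full body, eye contact, "
    "gradient background, grass,"
)
CHARACTER_1 = (
    "spotted hyena, canine, female, medium breasts, blue eyes, tan fur, "
    "long hair, black hair, blue crop top, brown miniskirt, smile, "
    "parted lips, earrings,"
)
CHARACTER_2 = (
    "horse, equine, male, jeans, white shirt, red bandana, fingerless gloves, "
    "green eyes, open smile, scar across eye, sunglasses on head, "
    "brown hair, brown fur,"
)


def build_prompt(groups, use_break):
    """Join tag groups, optionally separating them with BREAK statements."""
    separator = "\nBREAK\n" if use_break else "\n"
    return separator.join(groups)


groups = [COMMON, CHARACTER_1, CHARACTER_2]
prompt_without_break = build_prompt(groups, use_break=False)
prompt_with_break = build_prompt(groups, use_break=True)

if __name__ == "__main__":
    print(prompt_with_break)
```

Both variants contain exactly the same tags in the same order, so any difference between paired images on the same seed comes from the block boundaries alone.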
Generation information
Type | Data |
---|---|
Checkpoint | https://huggingface.co/Xeno443/3wolfMondAI-SDG |
Sampler | Euler A |
Scheduler | SGM Uniform |
Steps | 28 |
CFG | 4.0 |
Resolution | 832 x 1216 |
RescaleCFG | 0.7 |
Hires.Steps | 20 |
Hires.Scale | 1.5 |
Hires.Denoise | 0.3 |
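
For reference, the final image size implied by the table can be worked out from the base resolution and the hires scale. This is a minimal sketch; `hires_resolution` is a hypothetical helper, and it assumes the UI simply multiplies each side by the scale factor.

```python
def hires_resolution(width, height, scale):
    """Return the upscaled (width, height), assuming a plain per-side multiply."""
    return int(width * scale), int(height * scale)


if __name__ == "__main__":
    # Base resolution 832 x 1216 with Hires.Scale 1.5
    print(hires_resolution(832, 1216, 1.5))  # (1248, 1824)
```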
Prompt
masterpiece, best quality, newest,
(furry, anthro,) furgonomics,
duo, male/female, hug, full-length portrait, full body, eye contact, gradient background, grass,
<BREAK>
spotted hyena, canine, female, medium breasts, blue eyes, tan fur, long hair, black hair, blue crop top, brown miniskirt, smile, parted lips, earrings,
<BREAK>
horse, equine, male, jeans, white shirt, red bandana, fingerless gloves, green eyes, open smile, scar across eye, sunglasses on head, brown hair, brown fur,
Negative Prompt
worst quality, human hands,
Conclusions
From the overall result it's evident that some tags have a strong bias towards others. For example, if there is a female in the resulting image, the skirt and crop top have a near 100% chance of rendering on that character, while the jeans will render on the male. Earrings are less certain but also tend to appear on the female character. For other tags, like the sunglasses and the bandana, it is much more random which character (or both) will receive them.
Overall, this test shows that using BREAK does not improve the association of tags with specific characters, so it is recommended not to use it, which also avoids unnecessary token bloat.
Set1
Set2
Set3
Set4
Set5
Set6
Set7
Set8
Xeno443