Resonance Cascade

News 🔗 | Download | Artist Comparisons | Using Resonance | b-'s Training LoRA 🔗 | Tips | Discord 🔗


What is Resonance?

Jordach's original rentry: https://rentry.org/resonancemodel
Resonance is a Stable Cascade based fine-tune combining anime and furry, since everyone else seems to screw up model training in some manner.

Full models have begun training.
The currently available models are prototypes, used to fine-tune the training process and demonstrate Würstchen (Cascade) capabilities.

Name | Dataset
Rev1 | 3.69m e621, 2.45m gelbooru, ~50k derpibooru, some niji (See Prompting for tag list)
epsilon-epoch5 | ~440,000 dataset. Mix of e621 and gelbooru (See Prompting for tag list)

Support the model

Resonance R1 (with Lite/1B and 3.6B versions) is ready to begin training, but needs donations to speed it up: with enough funding, training takes weeks instead of six months.

All funds raised across all channels will be used to fund infrastructure rental to bring you the best version of Resonance, and help cover costs associated with ensuring free public access to the model (and generations, without involving CivitAI) remains available to anyone who wishes to use it.

Fundraising Campaign (USD)

Currently raised: $8,803.34
Fundraising STRETCH goals added, see fundraising page for milestones and details.

Other ways to support

Kofi | LiberaPay | PayPal

Model Milestones

  • Full Dataset Latent Creation - Funded - Complete
  • Text Encoder Pretraining - Funded - In Progress
  • 1B model 5 epoch Release Candidate - Funded - In Progress
  • Due to some changes in training hardware, training takes slightly longer. Partials will be released as a result.
  • 1B model 10 epoch Final Release - Funded
  • 3.6B model 5 Epoch Release Candidate - Funded
  • 3.6B model 10 Epoch Final Release - Not Yet Funded

Download

If you are using CAPGUI (see the "Using Resonance" section), models are downloaded during installation.

Stable Cascade models come in 3 pieces: Stage A, Stage B, Stage C. Stage B and Stage C have Lite 1B or Full 3.6B versions.

A "Lite" model is a cut down 1B parameter version of the 3.6B Stage C and is intended for lower performance and memory capacity hardware.
Models with "Lite" somewhere in the name will indicate that it is a 1B model. You can also identify whether the model is "Lite" or not by observing whether the model is 2GB for 16 bit values, or 4GB for 32 bit values.
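The file-size rule of thumb above follows from parameter count times bytes per value. A minimal sketch of that arithmetic, assuming decimal gigabytes and ignoring file-format overhead; the function name is hypothetical and not part of any Resonance tooling:

```python
def approx_checkpoint_gb(num_params: float, bits_per_value: int) -> float:
    """Rough checkpoint size in decimal GB: parameters x bytes per value."""
    bytes_per_value = bits_per_value / 8
    return num_params * bytes_per_value / 1e9


# Compare the Lite (1B) and Full (3.6B) sizes at 16-bit and 32-bit precision.
for label, params in (("Lite 1B", 1e9), ("Full 3.6B", 3.6e9)):
    for bits in (16, 32):
        size = approx_checkpoint_gb(params, bits)
        print(f"{label} @ {bits}-bit: ~{size:.1f} GB")
```

A file far larger than about 2GB (16 bit) or 4GB (32 bit) is therefore unlikely to be a Lite model.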

LoRAs must be created separately for the Lite and the Full 3.6B Stage C models. Stage C (Lite) cannot be used with official ControlNets. The official ControlNets are published for the 3.6B model, so they work with Stage C Full.

Combined Safetensor

Coming Soon...

Individual Stages

* Partial release from between epochs. It will be swapped for the full epoch when training completes.

Download links will be updated to most current releases.

Stage A ft hq is a version of Würstchen's Stage A that was finetuned to have slightly-nicer-looking textures. https://huggingface.co/madebyollin/stage-a-ft-hq


Artist Comparison

https://mega.nz/folder/xQFkBBYK#4hArFLTLbliidMIT3ipj_w (Courtesy of Machina)

  • epsilon-epoch5 prototype, top 4000 tags by image count, may contain NSFW

https://mega.nz/folder/JGMmTAJK#N_98VLXbeJVGNB5DBgNa4g (Courtesy of Tigerlith)

  • Artists from e621(furry) dataset. epsilon-epoch5 prototype, every artist with 5 or more images

https://mega.nz/folder/9fdSGa4L#rE1hrTLY1zE3rXltI9yXXQ (Courtesy of Tigerlith)

  • Artists from e621(furry) dataset. 1B 50.e6 model

Using Resonance

b-'s How to install/use the Cascade finetune Prototype For Smoothbrains

CAPGUI

https://github.com/Jordach/CAPGUI
A GUI that resembles Auto1111/Forge, tries to preserve as many of its habits as possible, and aims to provide a smoother user experience.
Also the front end for the aforementioned free generation infrastructure.
Uses ComfyUI as a backend, and in heavy development:
CAPGUI

Installation

CAPGUI uses ComfyUI as backend. A base ComfyUI install is needed.

CAPGUI can be installed on top of an existing ComfyUI installation, or you can install a separate ComfyUI installation just for CAPGUI. Using a separate ComfyUI installation is recommended, as CAPGUI is still being developed and may break an existing ComfyUI installation.

Installing ComfyUI

  1. Go to https://github.com/comfyanonymous/ComfyUI#Installing and follow the instructions for your hardware and OS.

Installing CAPGUI

  1. Make a folder where you want the CAPGUI installation to go.
  2. Go into the directory and clone CAPGUI with git clone https://github.com/Jordach/CAPGUI.git.
  3. Start ComfyUI. This is needed later to test the webhook.
  4. Start the installation by running start_comfy_installer.bat. It will open a console and start installing CAPGUI dependencies.
    Setup Console
  5. After dependencies install, a new browser window (in your default browser) will open. It should look like:
    Setup
  6. Fill in your ComfyUI install absolute path, ComfyUI address and port, then click the Test ComfyUI Path and WebSocket button. If you haven't changed the address and port of ComfyUI, the defaults should be fine. If you are unsure, launch ComfyUI and check.

Make sure ComfyUI is also running before clicking the test button

  • If you enter one of them incorrectly, it will show:
    Setup Error
  7. On successful test, you will be presented with model download options:
    Setup Success
  8. To get CAPGUI working with Resonance, you need the following:
    • Resonance Prototype Delta + Epsilon
    • Refiner Lite OR Refiner Large
    • Encoder/Decoder Models
  9. Make your selection and click Start Download. You should receive an info box in the top-right corner saying that the download has started. You can also check the console from Step 4 to see download progress:
    Downloading Models

Resonance models will be slower to download than Stage B and Stage C. In total it took me around 10 minutes to complete the downloads.

  10. When downloading is complete, the console will look like:
    Download Complete
  11. Close the console. Also close the ComfyUI console. ComfyUI needs to restart to install dependencies.

Running CAPGUI

  1. Start ComfyUI.

If you don't start ComfyUI first, you will get ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it.

  2. Start CAPGUI with start_gui.bat in the CAPGUI folder. A console will open up and start CAPGUI.
    CAP Start

If 0.0.0.0:6969 cannot connect, try using localhost:6969.
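The start order above matters because CAPGUI connects to ComfyUI over the network, and a refused connection is what produces the WinError 10061 message. A minimal sketch of a pre-flight check, assuming ComfyUI listens on 127.0.0.1:8188 (its usual default, which is not stated on this page); `port_is_open` is a hypothetical helper and not part of CAPGUI:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (nothing listening) and timeouts.
        return False


if __name__ == "__main__":
    # Assumption: ComfyUI runs on its usual default address and port.
    if port_is_open("127.0.0.1", 8188):
        print("ComfyUI is reachable; CAPGUI should be able to connect.")
    else:
        print("Nothing is listening; start ComfyUI first.")
```

If you changed the ComfyUI address or port during setup, substitute your own values.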

Updating CAPGUI

  1. Go to the CAPGUI folder. Run git pull to get the latest changes.
  2. Run update_comfy_nodes.bat and update_requirements.bat.

Manually Adding Models to CAPGUI
If you download new Resonance models you want to manually add to your existing CAPGUI installation, move the files from the following table to their corresponding folder.

Model | Folder
Base Model (Stage C) | ComfyUI/models/unet/cascade/stage_c/
Text Model (Stage C) | ComfyUI/models/clip/cascade/
Stage B | ComfyUI/models/unet/cascade/stage_b/
Stage A | ComfyUI/models/vae/cascade/
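The folder mapping above can be sketched as a small script. This is only an illustration: the `MODEL_DIRS` keys and the `install_model` helper are hypothetical names, and the paths are copied from the table, relative to the ComfyUI root.

```python
from pathlib import Path
import shutil

# Folder layout copied from the table above, relative to the ComfyUI root.
MODEL_DIRS = {
    "stage_c": "models/unet/cascade/stage_c",
    "text_model": "models/clip/cascade",
    "stage_b": "models/unet/cascade/stage_b",
    "stage_a": "models/vae/cascade",
}


def install_model(src: Path, kind: str, comfy_root: Path) -> Path:
    """Move a downloaded model file into the folder for its kind."""
    dest_dir = comfy_root / MODEL_DIRS[kind]
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    dest = dest_dir / src.name
    shutil.move(str(src), str(dest))
    return dest
```

For example, `install_model(Path("downloads/example_te.safetensors"), "text_model", Path("ComfyUI"))` would place a (hypothetical) text model file under ComfyUI/models/clip/cascade/.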

Setting Models

Base Model | Text Model | Refiner Model
Stage C UNET | Stage C TE, filename ending in _te | Stage B

Set Models

Using Stage A ft hq in CAP (enabled by default)
Stage A

Parameters

Parameter (default value) | Description
Base Steps (20) |
Base CFG (4) | Classifier-free guidance. Lower = more creative, Higher = closer to prompts. Too high = Fried.
Batch Size (1) | Number of images it will generate simultaneously. Too high will result in "Out of VRAM", influenced by image Height and Width.
Select From Batch (1) |
Save Generated Image (True) |
Base Seed (-1) | Seed used to generate image. -1 = random.
Width (1024) | Generated Image Width
Height (1024) | Generated Image Height
"Shift" (2) | This value "shifts" the denoising process of Stage C, which can somewhat create seed variations.
Compression (42, Auto) | The recommended default compression factor is 42. Automatic Compression Finder will find the compression factor that results in the best quality for your resolution. When compression reaches 80, higher resolutions can become unstable.
Automatic Compression (True) | Automatically adjust the compression value when changing Width or Height sliders.

Prompting

Resonance is trained on tags rather than natural language. Use the tag lists below to see learned tags. The third number in each column is the number of images that had that tag in the training data. More example images = better learned tag (usually).

Anime Tags (gelbooru) | Furry Tags (e621)
Prototype Anime CSV | Prototype Furry CSV
Rev1 Anime CSV | Rev1 Anime CSV

See Tips for more model-specific prompting tips.

All underscores _ have been removed in the tags. Don't use underscores when prompting. ✅ best quality ❌ best_quality
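If you copy tags straight from a booru site, they will still contain underscores. A minimal sketch of converting them before prompting; the helper names are hypothetical and the example tags are only illustrations:

```python
def normalize_tag(tag: str) -> str:
    """Convert a booru-style tag to the underscore-free form used in prompts."""
    # Replace underscores with spaces, then collapse any repeated whitespace.
    return " ".join(tag.replace("_", " ").split())


def build_prompt(tags: list[str]) -> str:
    """Join normalized tags into a comma-separated prompt."""
    return ", ".join(normalize_tag(t) for t in tags)


print(build_prompt(["best_quality", "1girl", "looking_at_viewer"]))
# prints: best quality, 1girl, looking at viewer
```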

Ctrl + Enter hotkey to "Generate".

Invoking a style

Styles are invoked with the by keyword followed by an artist name, e.g. by your mom. The list of available names is in the tag lists mentioned above.

Prompt Weight

Prompt weight interpretation is added by the "Advanced CLIP Text Encode" ComfyUI plugin.
Prompt weight interpretation can be changed using STYLE("option") at the beginning of the positive prompt, e.g. STYLE(A1111) 1girl, machine gun face. The default weight interpretation is the ComfyUI option. ComfyUI weights are much stronger than A1111, so if you get fried images with prompt weights, try changing to another weight interpretation.
STYLE(weight_interpretation, normalization)

The interpretation options are:

Auto1111 | ComfyUI | Comfy++
Style(A1111) | Style(Comfy) | Style(Comfy++)

For explanation on how different options affect weight interpretation, see: Advanced CLIP Text Encode

Prompt control uses the ComfyUI Prompt Control plugin. For a detailed explanation of the different control syntax, see https://github.com/asagi4/comfyui-prompt-control.

Prompt Scheduling

[start_prompt:end_prompt:switch_step]
Starts with start_prompt until switch_step, at which time it switches to end_prompt.
switch_step can be integer number (e.g. 10, change at step 10). Or decimal (e.g. 0.5, change at 50% of the total steps).
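The two forms of switch_step can be illustrated with a short sketch. This is not the Prompt Control plugin's code: the function names are hypothetical, it assumes a Python int means an absolute step and a float means a fraction of the total, and the plugin's exact behaviour at the boundary step may differ.

```python
def switch_index(switch_step, total_steps: int) -> int:
    """Absolute step at which the prompt changes.

    Assumption: an int is an absolute step, a float is a fraction of total steps.
    """
    if isinstance(switch_step, float):
        return int(switch_step * total_steps)
    return switch_step


def active_prompt(step, start_prompt, end_prompt, switch_step, total_steps):
    """Prompt in effect at a given step for [start_prompt:end_prompt:switch_step]."""
    if step < switch_index(switch_step, total_steps):
        return start_prompt
    return end_prompt
```

With the default 20 Base Steps, `[cat:dog:10]` and `[cat:dog:0.5]` both resolve to a switch at step 10 under these assumptions.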

Sequencing

[SEQ:tag_1:switch_1:tag_2:switch_2:tag_3:switch_3] is shorthand for [tag_1:[tag_2:[tag_3::switch_3]:switch_2]:switch_1] switching tag_1 -> tag_2 -> tag_3 at each defined switch steps.
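The expansion of the SEQ shorthand into nested scheduling brackets can be sketched recursively. The `expand_seq` helper is hypothetical and only reproduces the string rewrite described above; it is not how the plugin parses prompts.

```python
def expand_seq(pairs: list[tuple[str, str]]) -> str:
    """Expand (tag, switch) pairs into nested [tag:next:switch] scheduling syntax."""
    if not pairs:
        raise ValueError("need at least one (tag, switch) pair")
    (tag, switch), rest = pairs[0], pairs[1:]
    if not rest:
        # Last tag: nothing follows it, so the end prompt is empty.
        return f"[{tag}::{switch}]"
    # Otherwise the end prompt is the expansion of the remaining pairs.
    return f"[{tag}:{expand_seq(rest)}:{switch}]"


print(expand_seq([("tag_1", "switch_1"), ("tag_2", "switch_2"), ("tag_3", "switch_3")]))
# prints: [tag_1:[tag_2:[tag_3::switch_3]:switch_2]:switch_1]
```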

LoRA Prompting

Loras can be loaded in similar fashion to A1111 with the syntax <lora:cats:1>, which will load cats.safetensors or sdxl/lora/cats.safetensors. In the case of two loras with the same name, the first match will be loaded.
Full directory path relative to ComfyUI's search paths can also be used <lora:XL/cats:1>.
When no matches are found, it replaces all spaces with underscores and tries again, e.g. if <lora:a b:1> is not found, it tries again with <lora:a_b:1>.
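The lookup rules above (first match wins, optional directory prefix, retry with underscores) can be sketched as follows. This is an illustration only: `resolve_lora` and its matching logic are assumptions, not the plugin's implementation, and `available` stands in for whatever file list ComfyUI's search paths produce.

```python
import re

# Matches <lora:name:weight>; the name may include a relative directory.
LORA_RE = re.compile(r"<lora:([^:>]+):([^:>]+)>")


def resolve_lora(tag: str, available: list[str]):
    """Return (file_path, weight) for a <lora:name:weight> tag, or None."""
    m = LORA_RE.fullmatch(tag.strip())
    if m is None:
        return None
    name, weight = m.group(1), float(m.group(2))
    # Try the name as written, then with spaces replaced by underscores.
    for candidate in (name, name.replace(" ", "_")):
        for path in available:
            stem = path.removesuffix(".safetensors")
            # Match the bare file stem or a path ending in the candidate.
            if stem == candidate or stem.endswith("/" + candidate):
                return path, weight  # first match wins
    return None
```

For example, with `available = ["sdxl/lora/cats.safetensors"]`, the tag `<lora:cats:1>` resolves to that file with weight 1.0.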

Regional Prompting

There is currently a slight bug with Cascade: the CLIP input into Stage B is the same as for Stage C, causing the area to be divided twice. The workaround for ComfyUI is to feed Stage B a separate conditioning without regional prompt syntax.

Alternating

[tag_1|tag_2:percent_step] will alternate between tag_1 and tag_2 for every percent_step of total steps. percent_step defaults to 0.1 (10%) if omitted. More than two tags can be alternated e.g. [tag_1|tag_2|tag_3].
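To see which tag is in effect at a given step, the alternation can be sketched with integer arithmetic. The `alternating_tag` helper is hypothetical, and the plugin's rounding and boundary handling may differ from this sketch.

```python
def alternating_tag(tags, step, total_steps, percent_step=0.1):
    """Tag in effect at a step for [tag_1|tag_2|...:percent_step]."""
    # Number of steps each tag is held before switching to the next one.
    interval = max(1, round(percent_step * total_steps))
    return tags[(step // interval) % len(tags)]


# With 20 steps and the default 0.1, each tag is held for 2 steps.
print([alternating_tag(["tag_1", "tag_2"], s, 20) for s in range(6)])
# prints: ['tag_1', 'tag_1', 'tag_2', 'tag_2', 'tag_1', 'tag_1']
```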

Combining Prompts

concept_1 AND concept_2

Interpolation
BREAK

Although BREAK in ComfyUI Prompt Control did not work for SDXL, it has been fixed to work in Cascade.

BREAK pads the remaining tokens of the current 75-token chunk with empty tokens, so the text after it starts a new chunk.
e.g. by novelai, by artist 1 BREAK 1girl, female.
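The padding behaviour can be pictured with a rough sketch. It assumes whitespace-separated words stand in for real CLIP tokens, which is a simplification, and `pad_chunks` is a hypothetical helper rather than the plugin's code.

```python
import re

CHUNK = 75  # tokens per chunk, per the text above


def pad_chunks(prompt: str, pad_token: str = "") -> list[list[str]]:
    """Split on BREAK and pad each segment to a multiple of CHUNK tokens.

    Assumption: whitespace-separated words stand in for real CLIP tokens.
    """
    chunks = []
    for segment in re.split(r"\bBREAK\b", prompt):
        tokens = segment.split()
        # Round the length up to the next multiple of CHUNK (at least one chunk).
        padded_len = max(1, -(-len(tokens) // CHUNK)) * CHUNK
        chunks.append(tokens + [pad_token] * (padded_len - len(tokens)))
    return chunks


chunks = pad_chunks("by novelai, by artist 1 BREAK 1girl, female")
print([len(c) for c in chunks])  # prints: [75, 75]
```

In this picture the style tags fill the first chunk and the subject tags start cleanly in the second.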

ComfyUI

Sample workflow WIP

Tips

  • To improve eye highlights or pupils, put empty eyes, no pupils in the negative prompt. The logic is that empty eyes have no highlights.
  • Negative: traditional media \(artwork\)
  • Tag weights seem to be applied on the trailing comma rather than the tokens
    • Always end tokens with ,. e.g. 1girl, by xyz,
  • by novelai improves the quality and anatomy when using the prototype models because it has a large image count.
    • by novelai style has a heavy AOM look. To get a flatter anime style, either use scheduling (e.g. [by novelai::0.5]) or add 3d \(artwork\) to negative (credits Tigerlith).
  • If you get an image you like with only minor issues, use the "Shift" parameter to get subtle variations on the seed to try to improve the image.
Pub: 27 Apr 2022 06:13 UTC
Edit: 04 Jul 2024 12:02 UTC