SDExp's img2img Guide

Fanbox Pixiv

I'm sorry, but my Japanese is not very good, so instead I'm providing this web guide: https://gigazine.net/news/20220913-automatic1111-stable-diffusion-webui-img2img/

Work in Progress

img2img

What is img2img?

img2img, or image-to-image, directs the AI to create art based on another image.

You can access it via Stable Horde or WebUI by clicking the appropriate tab (for install instructions, please see my previous guide or this guide to WebUI).
How to access img2img

To use img2img, upload an image you would like the AI to change, then describe what you would like to see in the prompt.
It will be familiar to anyone who has used txt2img before (making AI art from prompts alone). The saying "garbage in, garbage out" applies here: if your original image is really bad (like some you will see in the examples), it will be harder to get something that looks nice without losing aspects of the original image.
Color and shape are important. It is much easier to guide the AI with rough edits of the original image; you can edit it in Paint or with the sketch tool in WebUI.

Denoising Strength (Init Strength)

The most important setting in img2img is "Denoising Strength" (or Init Strength).
This determines how much of the original image the AI will keep: the higher the number, the more the AI will change your original image. You then describe your image in the prompt to guide the AI.
Typically you will want to use the same dimensions as your original image, with the CFG (guidance) scale set from 7-14. All the other settings should be familiar if you have used txt2img to make AI art using prompts.
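Conceptually, denoising strength decides how much of the sampler's work is spent re-generating the image versus keeping the original. A rough sketch of that relationship (the function name and the exact rounding are my own illustration, not WebUI's actual code):

```python
def effective_steps(total_steps: int, denoising_strength: float) -> int:
    """Roughly how many denoising steps are applied to the init image.

    At strength 0.0 the image passes through untouched; at 1.0 the image
    is fully re-noised and regenerated, much like txt2img.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising strength must be in [0, 1]")
    return round(total_steps * denoising_strength)

print(effective_steps(20, 0.0))  # 0  -> original image kept as-is
print(effective_steps(20, 1.0))  # 20 -> original image fully replaced
```

This is why values around 0.4-0.6 are a common starting point: enough steps to change details, not enough to throw the whole composition away.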

Resize mode

This allows you to resize the new image if you want. Typically you will want to keep the dimensions of the image the same.
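For the curious, the resize modes roughly correspond to these PIL operations. This is my own approximation of what each mode does, not WebUI's actual code (WebUI's "fill" extends the image's edge colors rather than padding with black):

```python
from PIL import Image

def just_resize(img, size):
    """Stretch to the target size, ignoring aspect ratio."""
    return img.resize(size)

def crop_and_resize(img, size):
    """Scale to cover the target, then center-crop the overflow."""
    tw, th = size
    scale = max(tw / img.width, th / img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    left = (resized.width - tw) // 2
    top = (resized.height - th) // 2
    return resized.crop((left, top, left + tw, top + th))

def resize_and_fill(img, size):
    """Scale to fit inside the target, then pad the empty space."""
    tw, th = size
    scale = min(tw / img.width, th / img.height)
    resized = img.resize((round(img.width * scale), round(img.height * scale)))
    canvas = Image.new("RGB", size)  # black padding; WebUI fills smarter
    canvas.paste(resized, ((tw - resized.width) // 2, (th - resized.height) // 2))
    return canvas

src = Image.new("RGB", (640, 480), "white")
print(crop_and_resize(src, (512, 512)).size)  # (512, 512)
```

All three return the requested size; they differ only in what gets stretched, cut off, or padded.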

This covers the basics, but where img2img really shines is in a process known as "inpainting".

Inpainting

Inpainting support on Stable Horde is very limited; WebUI via a local or cloud install is recommended.

What is inpainting?

If you are familiar with "content-aware fill" in Photoshop, inpainting is like that on AI steroids.
Inpainting lets you change only certain parts of an image using img2img. This is done by "painting" the region you want the AI to change; this painted region is referred to as a mask. There are lots of additional settings and features unique to inpainting, so let's start simple!

Masks

Masking the face to change it
Access the inpainting UI (it's next to the img2img tabs) and upload your image. You should now be able to draw on your original image to create a mask.
In the example above, I made a quick mask of the original face to change it based on my ahegao LoRA (which you can download here😊).
It is important that your mask is large enough to cover what you want to change, yet avoids things you do not wish to change.
Take another look at the gif above: the mask is actually too small at the bottom, cutting off the tongue! The mask also clips part of the ear, which arguably makes that part of the image look worse.

Another thing to keep in mind is that the AI works better with more context. Even if you only wanted to change, for example, a single eye, I would still mask both eyes, including the forehead in between, to get more consistent results.
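A mask is just a grayscale image: white where the AI may repaint, black where it must keep the original. You normally draw it by hand in the UI, but building one programmatically makes the idea concrete (the coordinates here are made up for an imaginary 768x1024 portrait):

```python
from PIL import Image, ImageDraw

# White (255) = region the AI may repaint, black (0) = keep untouched.
mask = Image.new("L", (768, 1024), 0)
draw = ImageDraw.Draw(mask)
draw.ellipse((250, 180, 520, 470), fill=255)  # cover the whole face, not just one feature

# Sanity check: a pixel inside the ellipse is editable, a corner pixel is not.
print(mask.getpixel((384, 320)), mask.getpixel((10, 10)))  # 255 0
```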

Inpainting Settings

Here are the settings that I like to use as a baseline when inpainting.
We will go over a couple of the settings and what they do, as usually inpainting requires some tweaking.
Good baseline settings

Resize mode and Resolution

This should almost always be on "resize and fill" and 512x512 or 768x768.
Basically we are creating an image only in the masked area and then plopping it back into the original image, so the resolution does not need to match the original image.
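That "crop, regenerate, plop back in" flow can be sketched like this. The function and its parameters are my own illustration of the idea; the actual generation step is left out:

```python
from PIL import Image

def only_masked_crop(image, mask, padding=32, work_res=(512, 512)):
    """Crop the masked region (plus padding) and upscale it to the
    working resolution. Returns the crop and the box needed to paste
    the generated result back into the original image."""
    box = mask.getbbox()  # bounding box of the nonzero (white) mask pixels
    left = max(0, box[0] - padding)
    top = max(0, box[1] - padding)
    right = min(image.width, box[2] + padding)
    bottom = min(image.height, box[3] + padding)
    crop = image.crop((left, top, right, bottom)).resize(work_res)
    return crop, (left, top, right, bottom)

img = Image.new("RGB", (1024, 1536), "white")
msk = Image.new("L", (1024, 1536), 0)
msk.paste(255, (400, 300, 600, 500))  # a 200x200 masked face region
crop, box = only_masked_crop(img, msk)
print(crop.size, box)  # (512, 512) (368, 268, 632, 532)
```

After generating on the 512x512 crop, you would resize the result back down to the box's dimensions and paste it at (left, top). Because the masked region is rendered at full working resolution, small faces and hands get far more detail than they would in a whole-image pass.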

Mask Blur

This applies a blur to the edges of the mask, softening the transition between the inpainted region and the rest of the image.
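To see why that helps, look at what blurring does to a hard mask edge. A simple one-dimensional box blur (my stand-in for the Gaussian blur WebUI actually uses) turns the abrupt 0-to-255 jump into a gradient, so new content is blended in rather than pasted with a hard seam:

```python
def box_blur_1d(mask, radius):
    """Average each mask value with its neighbors within `radius`."""
    out = []
    n = len(mask)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(mask[lo:hi]) / (hi - lo))
    return out

hard_edge = [0, 0, 0, 255, 255, 255]  # unmasked | masked
soft_edge = box_blur_1d(hard_edge, 1)
print([round(v) for v in soft_edge])  # [0, 0, 85, 170, 255, 255]
```

The intermediate values (85, 170) act like partial opacity: pixels near the edge are a mix of old and new.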

Mask Mode

Setting this to "inpaint not masked" inverts the mask, so the AI changes everything except the painted region.
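Inverting a mask is just flipping each value, which is handy when the area you want to keep is easier to paint than the area you want to change:

```python
def invert_mask(mask):
    """Flip a grayscale mask row: masked (255) becomes kept (0) and vice versa."""
    return [255 - v for v in mask]

row = [0, 0, 255, 255, 0]
print(invert_mask(row))  # [255, 255, 0, 0, 255]
```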

Masked Content

This changes what the AI sees under the mask.
99.99% of the time this should be left on "original", unless you really hate the original image. If you are curious, this is what "latent noise" and "latent nothing" look like.
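Here is a pixel-space analogy of the three options (the real ones operate on latents, not pixels, so this is only an illustration of the idea; the function and mode names are mine):

```python
import random

def fill_masked(pixels, mask, mode, seed=0):
    """Return what the sampler starts from under the mask, per mode."""
    rng = random.Random(seed)
    out = []
    for p, m in zip(pixels, mask):
        if m == 0:                      # outside the mask: always untouched
            out.append(p)
        elif mode == "original":        # keep the source pixel (the default)
            out.append(p)
        elif mode == "latent noise":    # start from random noise
            out.append(rng.randrange(256))
        elif mode == "latent nothing":  # zeroed latents, roughly a flat gray
            out.append(128)
        else:
            raise ValueError(mode)
    return out

row = [10, 20, 30, 40]
mask = [0, 0, 255, 255]
print(fill_masked(row, mask, "original"))        # [10, 20, 30, 40]
print(fill_masked(row, mask, "latent nothing"))  # [10, 20, 128, 128]
```

"Original" keeps the source content as a starting point, which is why it blends best; the other two throw the masked content away entirely.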

img2img Workflow Examples

Okay, that's the basics. Let's see some examples.

Hands are usually the most fucked up part of an AI-generated image. To fix this, you can either inpaint for hundreds of generations or do a simple edit in Photoshop/Paint. I may make a guide going into more depth on editing later.

Here's a quick image that we spat out in txt2img. Let's inpaint the face and hands as an example:

Let's do the face first. I draw a mask like this, type in "looking at viewer, blush, grey eyes," use the baseline settings and generate:

Okay, not the expression I was looking for. Let's give the AI more context. Increase the mask size, and bump up the mask padding size, then generate:

Let's tweak the prompt: add "embarrassed" to the prompt and "shine" as a negative prompt:

Too much embarrassment. Change prompt to "(embarrassed:0.6)" and lower the denoising strength to 0.47:

And bumping denoising strength up to 0.55:

Not exactly what I envisioned but it looks good.

Now for the hands. Hands are famously what the AI struggles with the most, and I don't feel like generating loads of inpainted hands on my poor 1080 Ti. For now I will do a literal two-second edit in Paint. Stay tuned for a more in-depth guide on simple editing in Photoshop:

It looks pretty awful; hopefully img2img can save us:

Okay, admittedly it's not great, but I only ran img2img 4 times at 0.47 denoise, and it doesn't look too off at first glance.

Final image:

Troubleshooting Guidelines

If the new inpaint changes too much from the original, lower the "denoising strength".

!IMAGES HERE

If the new inpaint is too sharp or in a different style compared to the rest of the image, increase the "only masked padding".

!IMAGES HERE

Sometimes you need to tweak and generate a lot! Of course, img2img can be used for more than just refining anime images if you use a different model. I made these examples very quickly; you can get much better results than this!

Pub: 04 Feb 2023 09:52 UTC
Edit: 16 Feb 2023 23:09 UTC