Training Chroma using diffusion-pipe on Linux (on Winblows, install WSL2 for Linux support)
Notes
- Eval datasets use the same format as training datasets. Copy 1-4 image/text pairs from your training set into the eval folder. Normally you wouldn't poison eval with training data, but we're intentionally biasing the model here.
- I put all my configs in the examples folder (out of laziness)
- Losses around 0.2-0.4 are normal
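For the eval note above, copying a few pairs by hand works fine. If you'd rather script it, here is a minimal sketch. The function name copy_eval_pairs, the extension list, and the folder paths are assumptions made for illustration; none of this is part of diffusion-pipe.

```python
import random
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset contains.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def copy_eval_pairs(train_dir, eval_dir, count=4, seed=0):
    """Copy up to `count` image/caption pairs from train_dir into eval_dir."""
    train_dir, eval_dir = Path(train_dir), Path(eval_dir)
    # Only consider images that already have a matching .txt caption.
    images = sorted(
        p for p in train_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.with_suffix(".txt").is_file()
    )
    # A fixed seed makes the pick repeatable between runs.
    picked = random.Random(seed).sample(images, min(count, len(images)))
    eval_dir.mkdir(parents=True, exist_ok=True)
    for image in picked:
        caption = image.with_suffix(".txt")
        shutil.copy2(image, eval_dir / image.name)
        shutil.copy2(caption, eval_dir / caption.name)
    return sorted(p.name for p in picked)


# Example call (hypothetical paths, replace with your own folders):
# copy_eval_pairs("/path/to/train", "/path/to/eval", count=4)
```

The files are copied, not moved, so the training folder stays untouched.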
Installation
Dataset
- Put as many good images as you want in a folder (more isn't always better).
- Each caption goes in a .txt file with the same base name as its image, in the same folder (on Winblows, google how to show file extensions so you can see the real names). Example: image1.jpg and image1.txt
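Before training, it can help to confirm that every image has a caption. The sketch below is one way to do that. The function name check_dataset and the extension list are assumptions for illustration, not anything diffusion-pipe provides.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset contains.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def check_dataset(folder):
    """Report images without captions, captions without images, and empty captions."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    # Index files by base name (file name without the final extension).
    images = {p.stem: p for p in files if p.suffix.lower() in IMAGE_EXTS}
    captions = {p.stem: p for p in files if p.suffix.lower() == ".txt"}
    paired = images.keys() & captions.keys()
    return {
        "missing_caption": sorted(images[s].name for s in images.keys() - captions.keys()),
        "orphan_caption": sorted(captions[s].name for s in captions.keys() - images.keys()),
        "empty_caption": sorted(
            captions[s].name for s in paired
            if not captions[s].read_text(encoding="utf-8").strip()
        ),
    }


# Example call (hypothetical path, replace with your dataset folder):
# print(check_dataset("/path/to/train"))
```

A caption accidentally saved as image1.txt.txt (easy to do with hidden file extensions) shows up under orphan_caption, and image1.jpg shows up under missing_caption.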
Config
examples/chroma.toml
examples/chromadataset.toml, which is identical to examples/evaldataset.toml except for the dataset path
Running
NCCL_P2P_DISABLE="1" NCCL_IB_DISABLE="1" deepspeed --num_gpus=1 train.py --deepspeed --config examples/chroma.toml