AMD LoRA Training Guide

Written by Anon; mostly a copy-paste of https://rentry.org/lora_train, adjusted to work on AMD + Linux.

Limitations

This guide should work on

  • Radeon VII (Vega 20)
  • 6700XT up to 6950XT

and might work on:

  • 6700

with some VRAM tweaking
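
Before going further, you can check which GPU ISA the ROCm stack reports (this assumes ROCm itself is already installed; the grep is just a convenience filter):

    rocminfo | grep -i gfx

If your card does not show up as a supported ISA, the workaround commonly reported for RDNA2 cards is to export HSA_OVERRIDE_GFX_VERSION=10.3.0 before launching anything; treat that as a community workaround, not a guarantee.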

Prerequisites

You will need:

  • A Linux installation (I'm using Ubuntu 22.04)
  • Python 3.10
  • git
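
A quick sanity check that the software prerequisites are in place (version numbers will vary, but Python should report 3.10.x):

    python3 --version                                  # should print Python 3.10.x
    python3 -m venv --help >/dev/null && echo "venv module OK"
    git --version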

Installation

  1. Open a terminal and navigate to where you want to install it.
  2. Download the repo:
    git clone https://github.com/kohya-ss/sd-scripts
    
  3. Installation instructions (different from standard/NVIDIA):

    cd sd-scripts
    python3 -m venv venv
    source venv/bin/activate
    
    pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.2
    pip3 install --upgrade -r requirements.txt
    pip3 uninstall tensorflow
    pip3 install tensorflow-rocm
    

    Run these commands in exactly this order.
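
    Once everything has installed, it's worth sanity-checking that the ROCm build of PyTorch can actually see the GPU. The ROCm build reuses the torch.cuda API, so "cuda" here really means your AMD card:

    python3 -c "import torch; print(torch.__version__); print(torch.cuda.is_available() and torch.cuda.get_device_name(0))"

    If this prints False on an RDNA2 card, try the HSA_OVERRIDE_GFX_VERSION=10.3.0 workaround mentioned under Limitations.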

  4. Grab one of the two Python training scripts from https://github.com/derrian-distro/LoRA_Easy_Training_Scripts. If you have no idea what you're doing, grab the lora_train_popup.py version, as it basically walks you through everything.
  5. Edit lora_train_popup.py and set the following:

    self.xformers: bool = False
    

    sadly, we cannot use xformers, since facebookresearch only builds it for NVIDIA (CUDA) hardware

  6. Either set self.use_8bit_adam: bool = False, or follow the optional step "Building bitsandbytes for AMD" below (a scripted way to make this edit and the xformers one is sketched after this list)
  7. Then prepare your dataset.
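
If you'd rather script the edits from steps 5 and 6, here is a minimal sed sketch. It assumes both flags default to True in lora_train_popup.py; check the file first, since the exact lines may differ between versions:

    sed -i 's/self.xformers: bool = True/self.xformers: bool = False/' lora_train_popup.py
    sed -i 's/self.use_8bit_adam: bool = True/self.use_8bit_adam: bool = False/' lora_train_popup.py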

Building bitsandbytes for AMD (optional/read above!)

  1. Uninstall the current bitsandbytes
    pip uninstall bitsandbytes
    
  2. Get this unofficial AMD fork of bitsandbytes
    git clone https://github.com/broncotc/bitsandbytes-rocm
    cd bitsandbytes-rocm
    
  3. Build it with

    make hip
    

    if it fails, you might need to install make and libstdc++-12-dev, and pull in the HIP SDK:

    sudo apt install make libstdc++-12-dev
    amdgpu-install --usecase=rocm,hiplibsdk
    
  4. On success, install bitsandbytes with:
    python3 setup.py install
    
  5. Now you can leave the bitsandbytes-rocm directory with cd .. and prepare your dataset (a quick import check is sketched below)
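
To verify the ROCm build actually loads, a quick import check; as far as I can tell, kohya's scripts pick an 8-bit Adam variant from bitsandbytes.optim when use_8bit_adam is enabled:

    python3 -c "import bitsandbytes as bnb; print(bnb.optim.Adam8bit)"

If the import fails with a HIP/CUDA setup error, the build did not pick up your ROCm installation.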

Dataset preparation (shameless copy-paste)

Create a directory layout as shown below:

  • Example directory layout: https://mega.nz/folder/p5d3haJR#SmDSpaldBGcYzvZOx8sqbg
  • You can have one concept subfolder or ten, but you must have at least one.
  • Concept folders follow this format: <number>_<name>
    • the <number> determines the number of repeats your training script will do on that folder.
    • the <name> is purely cosmetic, as long as you have matching txt caption files.
    • Caption files are effectively mandatory; without them, the LoRA will train using the concept name as the caption for every image.
  • Learn how to do captions here
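
A minimal sketch of the layout; my_dataset, 10_mychar, and the file names are all hypothetical placeholders:

    mkdir -p my_dataset/10_mychar
    cp ~/pictures/mychar/*.png my_dataset/10_mychar/
    # every image needs a caption file with the same basename, e.g.
    # my_dataset/10_mychar/001.png -> my_dataset/10_mychar/001.txt
    echo "1girl, red hair, smiling" > my_dataset/10_mychar/001.txt

When the training script later asks for the dataset folder, point it at my_dataset (the parent), not at 10_mychar.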

Start training

  1. Open a terminal and navigate to your sd-scripts folder OR just reuse the terminal from the installation
  2. Make sure your venv is active or just:
    source venv/bin/activate
    
  3. Launch it with (see the note after this list about the thread count):
    venv/bin/accelerate-launch --num_cpu_threads_per_process 12 lora_train_popup.py
    
  4. Follow the popups, and make sure to select the top-level folder (the one holding the <number>_<name> subfolders) when asked.
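
The --num_cpu_threads_per_process 12 above matches my CPU; a simple way to match it to yours (nproc prints the thread count on Linux):

    venv/bin/accelerate-launch --num_cpu_threads_per_process "$(nproc)" lora_train_popup.py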

Tips and tweaks

  1. 512x512 training with batch size 1 uses 10 GB of VRAM on my system, plus about 1 GB for other apps.
  2. Neither batch size 2 nor 768x768 fits in 16 GB of VRAM.
  3. If you have an IGPU you can actually run all other apps on it by following this: Linux + IGPU + ANY DGPU
  4. Ignore the MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_40.kdb Performance may degrade message. There is no fix, and the performance loss only affects startup.
  5. View VRAM usage with radeontop in another terminal (a sysfs alternative is sketched after this list)
  6. This works sometimes: AMD GPU Linux Overclocking
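
If you'd rather not install radeontop, the amdgpu driver exposes VRAM counters through sysfs; a minimal sketch (card0 is an assumption, check /sys/class/drm/ for your card's index):

    watch -n1 'echo "$(( $(cat /sys/class/drm/card0/device/mem_info_vram_used) / 1048576 )) / $(( $(cat /sys/class/drm/card0/device/mem_info_vram_total) / 1048576 )) MiB"'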