how to run hunyuanvideo on an a40 pod on runpod

setup runpod

  1. put $10 into runpod and spin up an A40 instance with the pytorch 2.1.1 template (don't do spot — it can get reclaimed mid-download).
    • 80GB container disk + 80GB volume storage = 160GB total, which leaves room for the model weights.
  2. ssh into your pod once it's ready (see the example below).
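runpod shows the exact ssh command under the pod's "Connect" button. it looks roughly like this — the ip, port, and key path are placeholders, yours will differ:

ssh root@<pod-ip> -p <port> -i ~/.ssh/id_ed25519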

install miniconda

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
source ~/miniconda3/bin/activate

clone repo & set up env

git clone https://github.com/tencent/HunyuanVideo
cd HunyuanVideo
conda env create -f environment.yml
conda activate HunyuanVideo

download model in background

python -m pip install "huggingface_hub[cli]"
huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts &
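the trailing & runs the download in the background so you can keep setting up. before the final run, make sure it actually finished — a couple of ways to check (verification snippets, not part of the original steps):

jobs            # should list the huggingface-cli download job (running or done)
du -sh ./ckpts  # size stops growing once the weights are fully downloaded
wait            # or just block here until all background jobs finish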

install requirements

python -m pip install -r requirements.txt
python -m pip install git+https://github.com/Dao-AILab/flash-attention.git@v2.5.9.post1
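the flash-attention install compiles from source, so expect it to take a while. a quick sanity check once it finishes (just a verification one-liner, not from the original guide):

python -c "import flash_attn; print(flash_attn.__version__)"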

download llava mllm text encoder

cd ckpts
huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers
cd ..
python hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py --input_dir ckpts/llava-llama-3-8b-v1_1-transformers --output_dir ckpts/text_encoder
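if the preprocess script worked, ckpts/text_encoder should now contain the converted llava weights — quick check:

ls ckpts/text_encoder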

download clip text encoder

cd ckpts
huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
cd ..

run the damn thing

python3 sample_video.py --video-size 360 640 --video-length 65 --infer-steps 30 --prompt "a cat is running, realistic." --flow-reverse --seed 0 --use-cpu-offload --save-path ./results


profit

done. you'll get your video in ./results.
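to pull the video down to your local machine, scp works fine — run this from your local terminal with the same host/port/key you used to ssh in (placeholders below; adjust the remote path to wherever you cloned the repo):

scp -P <port> -i ~/.ssh/id_ed25519 root@<pod-ip>:/root/HunyuanVideo/results/*.mp4 .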
