If you haven't seen it yet, there's a new model called HunyuanVideo that is by far the local SOTA video model: https://x.com/TXhunyuan/status/1863889762396049552#m

Our overlord kijai made a ComfyUI node that makes this possible in the first place.

How to install:

1) Go to the ComfyUI_windows_portable\ComfyUI\custom_nodes folder, open cmd and type this command:

git clone https://github.com/kijai/ComfyUI-HunyuanVideoWrapper

2) Go to the ComfyUI_windows_portable\update folder, open cmd and type these 4 commands:
..\python_embeded\python.exe -s -m pip install "accelerate>=1.1.1"

..\python_embeded\python.exe -s -m pip install "diffusers>=0.31.0"

..\python_embeded\python.exe -s -m pip install "transformers>=4.39.3"

..\python_embeded\python.exe -s -m pip install ninja
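
If you'd rather not type the four commands one by one, here is a minimal Python sketch that runs them in a loop. The function names (build_pip_command, install_all) are made up for this example. Each version specifier is passed as a single list element, so cmd never gets a chance to treat the ">" as a redirect.

```python
import subprocess
import sys

# Minimum versions taken from the four commands in this step.
REQUIREMENTS = [
    "accelerate>=1.1.1",
    "diffusers>=0.31.0",
    "transformers>=4.39.3",
    "ninja",
]


def build_pip_command(python_exe, requirement):
    """Return the argument list for one 'pip install' call."""
    return [python_exe, "-s", "-m", "pip", "install", requirement]


def install_all(python_exe=sys.executable):
    """Install every requirement with the given interpreter, stopping on the first failure."""
    for requirement in REQUIREMENTS:
        subprocess.run(build_pip_command(python_exe, requirement), check=True)
```

Save it in the update folder, add a final line calling install_all(), and run the file with ..\python_embeded\python.exe so that sys.executable points at the embedded interpreter.
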

3) Install those 2 custom nodes via ComfyUI Manager:

4) SageAttention2 needs to be installed. First make sure you have recent enough versions of these packages:
python>=3.9

torch>=2.3.0

CUDA>=12.4

triton>=3.0.0 (see 4a) and 4b) for its installation)

Personally I have python 3.11 + torch (2.5.1+cu124) + triton 3.1.0
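
To see where you stand, here is a rough sketch that compares what's installed against the minimums above. It only reads package metadata, so it also works when torch or triton is missing. The helper names (parse_version, report) are invented for this example, and CUDA is left out because it isn't a pip package.

```python
import re
import sys
from importlib import metadata

# Minimums listed above for the pip packages.
MINIMUMS = {"torch": (2, 3, 0), "triton": (3, 0, 0)}


def parse_version(text):
    """Turn '2.5.1+cu124' into (2, 5, 1), ignoring the local suffix after '+'."""
    core = text.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '2.5' to (2, 5, 0) so comparisons stay fair
        parts.append(0)
    return tuple(parts)


def report():
    """Return one status line for Python and one per package in MINIMUMS."""
    python_ok = sys.version_info >= (3, 9)
    lines = [f"python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}"]
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            lines.append(f"{name}: not installed")
            continue
        ok = parse_version(installed) >= minimum
        lines.append(f"{name} {installed}: {'ok' if ok else 'too old'}")
    return lines
```

Run it with the embedded interpreter and print the result with print("\n".join(report())).
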

4a) To install triton, download one of these wheels:

If you have python 3.11: https://github.com/woct0rdho/triton-windows/releases/download/v3.1.0-windows.post5/triton-3.1.0-cp311-cp311-win_amd64.whl

If you have python 3.12: https://github.com/woct0rdho/triton-windows/releases/download/v3.1.0-windows.post5/triton-3.1.0-cp312-cp312-win_amd64.whl

Put the wheel in the ComfyUI_windows_portable\update folder

Go to the ComfyUI_windows_portable\update folder, open cmd and type this command:

..\python_embeded\python.exe -s -m pip install triton-3.1.0-cp311-cp311-win_amd64.whl

or

..\python_embeded\python.exe -s -m pip install triton-3.1.0-cp312-cp312-win_amd64.whl
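
If you're not sure which of the two wheels matches your embedded Python, this small sketch builds the filename from the interpreter's version. The function name triton_wheel_name is made up for the example, and it only knows the two wheels listed above.

```python
import sys


def triton_wheel_name(version_info=sys.version_info):
    """Build the triton 3.1.0 wheel filename for Python 3.11 or 3.12."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) not in {(3, 11), (3, 12)}:
        raise ValueError(f"no wheel listed in this guide for Python {major}.{minor}")
    tag = f"cp{major}{minor}"
    return f"triton-3.1.0-{tag}-{tag}-win_amd64.whl"
```

Calling triton_wheel_name() with no argument from ..\python_embeded\python.exe gives the name for the interpreter ComfyUI actually uses.
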
4b) Triton still won't work if we don't do this:
  • Install Python 3.11.9 (or 3.12.x if your ComfyUI uses Python 3.12) on your computer
  • Go to C:\Users\Home\AppData\Local\Programs\Python\Python311 (replace "Home" with your Windows user name) and copy the libs and include folders
  • Paste those folders into ComfyUI_windows_portable\python_embeded
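
The copy in 4b) can also be scripted. This is only a sketch: copy_dev_folders is a made-up name, and both paths are arguments because they depend on your user name and on where you unpacked ComfyUI.

```python
import shutil
from pathlib import Path


def copy_dev_folders(python_install, embedded_python):
    """Copy 'libs' and 'include' from a full Python install into the embedded one."""
    copied = []
    for name in ("libs", "include"):
        source = Path(python_install) / name
        if not source.is_dir():
            raise FileNotFoundError(f"missing folder: {source}")
        target = Path(embedded_python) / name
        # dirs_exist_ok lets the copy be re-run without failing on existing folders
        shutil.copytree(source, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

The first argument would be the Python311 (or Python312) folder from the step above, the second the ComfyUI_windows_portable\python_embeded folder.
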
4c) Install the CUDA Toolkit on your PC. It must be CUDA >=12.4, and the version must match the one associated with torch; you can see the torch+CUDA version in the cmd console when you launch ComfyUI.

For example I have Cuda 12.4 so I'll go for this one: https://developer.nvidia.com/cuda-12-4-0-download-archive
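
To work out which CUDA Toolkit goes with your torch build, you can read it off the version string. A small sketch, assuming the string ends with a "+cuXYZ" tag like the 2.5.1+cu124 mentioned earlier (the function name is invented for this example):

```python
import re


def cuda_version_from_torch(torch_version):
    """Return 'major.minor' from a string like '2.5.1+cu124', or None if there is no CUDA tag."""
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"
```

With torch installed, torch.version.cuda reports the same information directly.
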

4d) Install Microsoft Visual Studio (You need it to build wheels)
4e) Go to the ComfyUI_windows_portable folder, open cmd and type this command:

git clone https://github.com/thu-ml/SageAttention

4f) Go to the ComfyUI_windows_portable\SageAttention\csrc folder, and open the math.cuh file with Notepad or Visual Studio Code

On lines 71 and 146, replace "ushort" with "unsigned short" and save the file.
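
If you prefer not to edit the file by hand, here is a sketch of the same replacement in Python. The function names are made up, and it refuses to touch a line that doesn't contain "ushort", since the line numbers may shift if the repository changes.

```python
from pathlib import Path


def replace_on_lines(text, line_numbers, old="ushort", new="unsigned short"):
    """Replace 'old' with 'new' only on the given 1-based line numbers."""
    lines = text.splitlines(keepends=True)
    for number in line_numbers:
        index = number - 1
        if not 0 <= index < len(lines):
            raise IndexError(f"file has no line {number}")
        if old not in lines[index]:
            raise ValueError(f"line {number} does not contain {old!r}")
        lines[index] = lines[index].replace(old, new)
    return "".join(lines)


def patch_file(path, line_numbers=(71, 146)):
    """Apply replace_on_lines to a file in place."""
    source = Path(path)
    source.write_text(replace_on_lines(source.read_text(), line_numbers))
```

From the SageAttention folder that would be patch_file("csrc/math.cuh"). If it raises an error, open the file and check where "ushort" appears now.
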

4g) Go to the ComfyUI_windows_portable\SageAttention folder, open cmd and type this command:
..\python_embeded\python.exe setup.py install

Congrats, you just installed SageAttention2 onto your python packages.
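
A quick way to confirm the install landed in the embedded Python is to check that the module can be found. This sketch assumes the package is importable as "sageattention"; is_importable is a made-up helper name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the module can be located, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None
```

Run it with ..\python_embeded\python.exe and call is_importable("sageattention"). If it returns False, the build from 4g) most likely went into a different Python.
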

5) Go to the ComfyUI_windows_portable\ComfyUI\models\vae folder and create a new folder called "hyvid"

Download the VAE and put it in the ComfyUI_windows_portable\ComfyUI\models\vae\hyvid folder

6) Go to the ComfyUI_windows_portable\ComfyUI\models\diffusion_models folder and create a new folder called "hyvideo"

Download the Hunyuan Video model and put it in the ComfyUI_windows_portable\ComfyUI\models\diffusion_models\hyvideo folder

7) Go to the ComfyUI_windows_portable\ComfyUI\models folder and create a new folder called "LLM"

Go to the ComfyUI_windows_portable\ComfyUI\models\LLM folder and create a new folder called "llava-llama-3-8b-text-encoder-tokenizer"

Download all the files from there and put them in the ComfyUI_windows_portable\ComfyUI\models\LLM\llava-llama-3-8b-text-encoder-tokenizer folder

8) Go to the ComfyUI_windows_portable\ComfyUI\models\clip folder and create a new folder called "clip-vit-large-patch14"

Download all the files from there (except flax_model.msgpack, pytorch_model.bin and tf_model.h5) and put them in the ComfyUI_windows_portable\ComfyUI\models\clip\clip-vit-large-patch14 folder.
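
The folders from steps 5 to 8 can be created in one go. A minimal sketch, with a made-up function name, that takes the path of your ComfyUI\models folder as its argument:

```python
from pathlib import Path

# Folder names from steps 5 to 8, relative to ComfyUI\models.
MODEL_FOLDERS = [
    "vae/hyvid",
    "diffusion_models/hyvideo",
    "LLM/llava-llama-3-8b-text-encoder-tokenizer",
    "clip/clip-vit-large-patch14",
]


def create_model_folders(models_dir):
    """Create every folder in MODEL_FOLDERS under models_dir and return their paths."""
    created = []
    for relative in MODEL_FOLDERS:
        folder = Path(models_dir) / relative
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created
```

It only creates the empty folders; the downloads still have to be placed by hand.
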

And there you have it. You can now enjoy this model; it works best at those recommended resolutions.

On a 24 GB VRAM card, the most you can do is 544x960 at 97 frames (4 seconds).
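
For other frame counts, the clip length is just frames divided by the frame rate. The 24 fps default below is an assumption inferred from "97 frames (4 seconds)", not something stated in this guide.

```python
def clip_seconds(frames, fps=24):
    """Return the clip length in seconds for a frame count (fps=24 is an assumption)."""
    return frames / fps
```

For example, clip_seconds(97) comes out at a bit over 4 seconds.
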

Mario in a noir style.

If you're interested, here is the workflow for that video as well: https://files.catbox.moe/kau0by.webm

Pub: 05 Dec 2024 20:36 UTC
Edit: 05 Dec 2024 20:52 UTC