Installing VoiceCraft on Windows and running voice infill:

Step 0: Clone the repo, then download the models from the Hugging Face link below
git clone https://github.com/jasonppy/VoiceCraft
cd VoiceCraft

https://huggingface.co/pyp1/VoiceCraft/tree/main

Step 1: Update conda, then install libmamba and set it as the solver (instructions at the link below)
https://www.anaconda.com/blog/conda-is-fast-now

Step 2: Create env

conda create -n voicecraft python=3.9.16
conda activate voicecraft

Step 3: Install ffmpeg and espeak-ng, add their install folders to PATH, and make sure you can call both from cmd
https://bootphon.github.io/phonemizer/install.html
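
One way to confirm Step 3 worked is a small Python check run inside the activated env. This is only a sketch: the executable names "ffmpeg" and "espeak-ng" are assumptions about how your installs are named, and missing_tools is a made-up helper, not part of VoiceCraft.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Executable names are assumptions; adjust them if your installs differ.
    missing = missing_tools(["ffmpeg", "espeak-ng"])
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and espeak-ng are both on PATH")
```

If something is reported as missing, reopen cmd after editing PATH, since an already-open window keeps the old value.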

Step 4: Install in order:

conda install ipykernel
conda install -c conda-forge montreal-forced-aligner=2.2.17 openfst=1.8.2 kaldi=5.5.1068
pip install phonemizer==3.2.1

pip install -e git+https://github.com/facebookresearch/audiocraft.git@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0#egg=audiocraft
pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118

pip install ipywidgets
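
After the installs above, it may be worth checking that the cu118 build of torch actually sees your GPU. A minimal sketch, assuming an NVIDIA card with a working driver; cuda_report is a hypothetical helper name:

```python
def cuda_report():
    """Summarize whether torch is importable and whether it can use CUDA."""
    report = {"torch_installed": False, "torch_version": None, "cuda_available": False}
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


print(cuda_report())
```

If cuda_available comes back False even though torch is installed, a CPU-only wheel may have been pulled in, and rerunning the torch line with the cu118 index URL is the first thing to try.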

Step 5: Download the MFA models

mfa model download acoustic english_us_arpa
mfa model download dictionary english_us_arpa

When running, it will complain "no triton installed". This is apparently fine: https://github.com/invoke-ai/InvokeAI/issues/2611

Also, just pip install any missing libraries as the errors come up.
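
To find missing libraries up front instead of one error at a time, something like this could be run in the env. It is a sketch only: the module list is a guess based on the packages installed in Step 4, not a complete list of what VoiceCraft imports, and missing_modules is a made-up helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this env."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list, taken from the Step 4 installs; extend it as needed.
    wanted = ["torch", "torchaudio", "xformers", "audiocraft", "phonemizer", "ipywidgets"]
    for name in missing_modules(wanted):
        print("missing:", name)
```

Note that a module name is not always the same as the pip package name, so check the package's page before installing.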

Pub: 31 Mar 2024 01:19 UTC