I have a machine with a Quadro P5000 graphics card, running Windows 10. I'd like to train a TTS voice on this system. What do I need to install to make this work?
1 Answer
Here's what to install/do:
- Download and install Python 3.8 (not 3.9+) for Windows. During the installation, ensure that you:
  - Opt to install it for all users.
  - Opt to add Python to the PATH.
- Download and install CUDA Toolkit 10.1 (not 11.0+).
- Download "cuDNN v7.6.5 (November 5th, 2019), for CUDA 10.1" (not cuDNN v8+), extract it, and then copy what's inside the `cuda` folder into `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1`.
- Download the latest 64-bit version of eSpeak NG (no version constraints :-) ).
- Download the latest 64-bit version of Git for Windows (no version constraints :-) ).
- Open a PowerShell prompt to a folder where you'd like to install Coqui TTS.
- Run `git clone https://github.com/coqui-ai/TTS.git`.
- Run `cd TTS`.
- Run `python -m venv .` (the trailing dot creates the virtual environment in the current folder).
- Run `.\Scripts\pip install -e .`
- Run the following command (this differs from the command you get from the PyTorch website because of a known issue):
  .\Scripts\pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
- Put the following into a script called "test_cuda.py" in the `TTS` folder:
import torch
x = torch.rand(5, 3)
print(x)
print(torch.cuda.is_available())
- Run the script via `.\Scripts\python ./test_cuda.py` and confirm the output looks like this (the first part should have just random numbers, but the last line must read `True`; if it does not, CUDA is not installed properly):
tensor([[0.2141, 0.7808, 0.9298],
        [0.3107, 0.8569, 0.9562],
        [0.2878, 0.7515, 0.5547],
        [0.5007, 0.6904, 0.4136],
        [0.2443, 0.4158, 0.4245]])
True
- Put the following into a script called "train.bat" in the `TTS` folder, and then customize it for your configuration file:
set PYTHONIOENCODING=UTF-8
set PYTHONLEGACYWINDOWSSTDIO=UTF-8
set PHONEMIZER_ESPEAK_PATH=C:/Program Files/eSpeak NG/espeak-ng.exe
.\Scripts\python.exe ./TTS/bin/train_tacotron.py --config_path "C:/path/to/your/config.json"
- Run the script via `.\train.bat`.
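If test_cuda.py prints False, a more verbose report can help narrow down why. The following is only a sketch that extends the test script above: the file name "cuda_report.py" and the function name `cuda_report` are my own, and the expected version string in the comment assumes the pip command from the earlier step was used unchanged.

```python
# cuda_report.py (hypothetical name): a more verbose variant of test_cuda.py

def cuda_report():
    """Collect PyTorch/CUDA details in a dict; also works if torch is missing."""
    report = {"torch_installed": False}
    try:
        import torch
    except (ImportError, OSError):
        # torch is not installed in this environment, or its DLLs failed to load
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__    # assumed: 1.8.0+cu101 with the steps above
    report["built_for_cuda"] = torch.version.cuda  # None means a CPU-only build
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        report["cudnn_version"] = torch.backends.cudnn.version()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Run it the same way as the test script (`.\Scripts\python ./cuda_report.py`). If `built_for_cuda` is None, pip most likely installed a CPU-only build of torch rather than the `+cu101` one; if it shows a CUDA version but `cuda_available` is False, the driver or toolkit installation is the more likely culprit.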
If you are using a different model than Tacotron or need to pass other parameters into the training script, feel free to further customize `train.bat`.
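If you would rather pass the configuration path and extra parameters on the command line than edit a batch file each time, the same setup can be sketched in Python. This is an illustrative equivalent of train.bat, not something that ships with Coqui TTS: the file name "run_training.py" and the helpers `build_env` and `build_command` are hypothetical, while the environment variables, eSpeak NG path, and training script path are copied from train.bat above.

```python
# run_training.py (hypothetical name): a Python stand-in for train.bat
import os
import subprocess
import sys

ESPEAK_PATH = "C:/Program Files/eSpeak NG/espeak-ng.exe"  # same path as in train.bat
TRAIN_SCRIPT = "./TTS/bin/train_tacotron.py"              # swap for another model's script


def build_env(base_env, espeak_path=ESPEAK_PATH):
    """Return a copy of base_env with the variables that train.bat sets."""
    env = dict(base_env)
    env["PYTHONIOENCODING"] = "UTF-8"
    env["PYTHONLEGACYWINDOWSSTDIO"] = "UTF-8"
    env["PHONEMIZER_ESPEAK_PATH"] = espeak_path
    return env


def build_command(config_path, extra_args=(), python_exe=sys.executable,
                  script=TRAIN_SCRIPT):
    """Build the argument list for the training process."""
    return [python_exe, script, "--config_path", config_path, *extra_args]


if __name__ == "__main__":
    if len(sys.argv) > 1:
        command = build_command(sys.argv[1], sys.argv[2:])
        sys.exit(subprocess.call(command, env=build_env(os.environ)))
    else:
        print("usage: run_training.py CONFIG_JSON [extra training args...]")
```

From the `TTS` folder this would be run as `.\Scripts\python ./run_training.py "C:/path/to/your/config.json"`; anything after the config path is passed through to the training script untouched.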
If you are just getting started with TTS training in general, take a peek at How do I get started training a custom voice model with Mozilla TTS on Ubuntu 20.04?.
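Since most problems with these steps come down to a version mismatch, it may also be worth checking the prerequisites before training. The script below is a rough sketch under a few assumptions: the file name "preflight.py" and the function names are hypothetical, the paths are the default install locations used in the steps above, and the cuDNN check assumes the cuDNN 7 archive places `cudnn64_7.dll` in the `bin` folder.

```python
# preflight.py (hypothetical name): checks the prerequisites listed in the steps
import os
import shutil
import sys

CUDA_DIR = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1"
ESPEAK_EXE = "C:/Program Files/eSpeak NG/espeak-ng.exe"


def python_ok(version_info=sys.version_info):
    """True only for Python 3.8.x, the version the steps call for."""
    return tuple(version_info[:2]) == (3, 8)


def preflight(cuda_dir=CUDA_DIR, espeak_exe=ESPEAK_EXE):
    """Return a dict mapping each prerequisite to True/False."""
    # Assumption: cuDNN 7 for Windows ships bin\cudnn64_7.dll
    cudnn_dll = os.path.join(cuda_dir, "bin", "cudnn64_7.dll")
    return {
        "python 3.8": python_ok(),
        "CUDA 10.1 folder": os.path.isdir(cuda_dir),
        "cuDNN copied": os.path.isfile(cudnn_dll),
        "eSpeak NG": os.path.isfile(espeak_exe),
        "git on PATH": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

Run it with the virtual environment's interpreter (`.\Scripts\python ./preflight.py`) so that the Python version check reflects what training will actually use.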

GuyPaddock
- If you get "UnicodeEncodeError: ‘charmap’ codec can’t encode characters in position : character maps to <undefined>" during training, you may need to apply changes from https://github.com/coqui-ai/TTS/pull/394 – GuyPaddock Mar 20 '21 at 20:52
- How are you supposed to get this working on RTX cards then that are CUDA 11? – Skyler Feb 13 '23 at 07:01
- i had to additionally do `.\Scripts\pip install networkx==2.8.8` because gruut requires networkx <3 and by the given command above by default, networkx was installed in version 3 – para Mar 18 '23 at 14:02
- I have followed these steps very carefully, but I'm still getting False to CUDA being available. I've moved the folder as well as installed the CUDA toolkit. My GPU is a 1070 Ti if that matters. Any idea? – Tessa Painter Mar 26 '23 at 01:29
- Any chance this could get an update? There are too many conflicting versions with packages now. – BrunoLM Jul 08 '23 at 12:21
- I haven't been doing anything with TTS since 2021 and I have an older card so I don't have the context for the updates, but if someone has updates to suggest (or even a video) I would be happy to update these steps. – GuyPaddock Jul 21 '23 at 14:23