I want Festival TTS to read a bit slower. Can anyone help me with that? I use Python 2.7 and run the code in gnome-terminal.
-
is your problem specific to python? or are you just trying to use festival for your own needs and you'd like a slower speed? – Tasos Papastylianou Jul 25 '16 at 16:21
-
I am developing a program and I need it to say the text a bit slowly, so it's for my own needs. @TasosPapastylianou – ConfidingOz Jul 27 '16 at 10:22
-
Ah, no idea then sorry. I was just going to suggest alternative readers whose options I know (and which I find to be much better quality), so I was just checking just in case it was a case of the [XY Problem](http://xyproblem.info/) :p – Tasos Papastylianou Jul 27 '16 at 10:46
4 Answers
What does your `~/.festivalrc` look like? To use festival with ALSA, I have:
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "aplay -Dplug:default -f S16_LE -r 15000 $FILE")
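Using `aplay`, the rate of playback is determined by the value after the `-r` flag, which you can increase to make it speak faster, or decrease to make it slower. Changing the playback rate this way also shifts the pitch in the same direction.
If you're not using ALSA, then adding `(Parameter.set 'Duration_Stretch 1.5)` or similar may help. Values above 1 stretch the durations and so slow the speech down.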
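Since the question mentions Python, the `Duration_Stretch` setting can also be applied per call by piping Scheme commands to `festival --pipe`, without touching `~/.festivalrc`. This is only a minimal sketch, assuming `festival` is on the PATH; the function names `build_festival_script` and `speak_slowly` are made up for illustration.

```python
import subprocess


def build_festival_script(text, stretch=1.5):
    """Return Festival Scheme commands that speak `text` with stretched durations.

    stretch > 1.0 slows the speech down; stretch < 1.0 speeds it up.
    """
    # Escape backslashes and double quotes so the text stays a valid Scheme string.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return "(Parameter.set 'Duration_Stretch {0})\n(SayText \"{1}\")\n".format(
        stretch, escaped
    )


def speak_slowly(text, stretch=1.5):
    """Pipe the generated commands to `festival --pipe` (assumes festival is installed)."""
    proc = subprocess.Popen(["festival", "--pipe"], stdin=subprocess.PIPE)
    proc.communicate(build_festival_script(text, stretch).encode("utf-8"))
    return proc.returncode
```

Calling `speak_slowly("Hello world", 1.8)` should then read the text noticeably slower than the default.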
-
works for me: config as above but second line replaced with `(Parameter.set 'Audio_Command "play -b 16 -c 1 -e signed-integer -r $SR -t raw $FILE tempo 1.5 pitch -100")` (1.5x faster and lower voice in my example). – d9k Nov 18 '17 at 21:38
-
Bash command without writing configuration file: `TEMPO=1.5 ; PITCH=-100 ; FILE_TO_PLAY_PATH=/tmp/readme.txt ; echo "(Parameter.set 'Audio_Command "\""play -q -b 16 -c 1 -e signed-integer -r \$SR -t raw \$FILE tempo ${TEMPO} pitch ${PITCH}"\"") (Parameter.set 'Audio_Method 'Audio_Command) (tts_file \""${FILE_TO_PLAY_PATH}"\" nil)" | festival --pipe` – d9k Nov 19 '17 at 21:28
If you are okay with writing a wrapper around it, you can use Sable markup and its RATE tag. For reference, here is an example project I made: http://www.cs.cmu.edu/~srallaba/Audio_Rendering_of_STEM/ in which technique 2 has rate variations.
Alternatively, you can use flite (Festival Lite). While Festival was designed to enable research in speech synthesis, flite is ideal for real-time implementations. The README has an example that stretches duration using flite:
./bin/flite --setf duration_stretch=1.5 doc/alice
Hope it helps.
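From Python, the same flite call can be made with `subprocess`. This is a rough sketch, assuming the `flite` binary is on the PATH and that its `-t` option (speak a text string instead of a file) is available in your build; `flite_command` and `speak_with_flite` are hypothetical helper names.

```python
import subprocess


def flite_command(text, stretch=1.5):
    """Build the argument list for flite; duration_stretch > 1.0 slows speech down."""
    return ["flite", "--setf", "duration_stretch={0}".format(stretch), "-t", text]


def speak_with_flite(text, stretch=1.5):
    """Run flite and return its exit code (assumes flite is installed)."""
    return subprocess.call(flite_command(text, stretch))
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or quotes.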

-
Welcome to Stack Overflow! A link to a potential solution is always welcome, but please [add context around the link](http://meta.stackexchange.com/a/8259) so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline. Take into account that being barely more than a link to an external site is a possible reason as to [Why and how are some answers deleted?](http://stackoverflow.com/help/deleted-answers). – FelixSFD Dec 23 '16 at 17:55
I had exactly the same problem and AFAIK, it is not possible to do that (I also hope to be wrong, so please correct me). It is also not possible to e.g. shift the frequency range of the voice. That is, without tinkering with the voice files (did not check this as it seems more than what I'd like to do).
Personally, I solved this by using the old mbrola voices and espeak. I used a Python wrapper to invoke espeak from the command line, but there is also a somewhat old library. Despite the voice quality being lower than the CMU voices, the overall experience is IMHO sometimes better.
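A wrapper of that kind can be quite small. The following is a sketch rather than the exact wrapper described above: it assumes `espeak` is on the PATH and uses its `-s` option (speed in words per minute) and `-v` option (voice name). The voice `mb-en1` in the usage line is only an example and requires the matching mbrola voice to be installed; `espeak_command` and `speak_with_espeak` are hypothetical names.

```python
import subprocess


def espeak_command(text, words_per_minute=120, voice=None):
    """Build the argument list for espeak; a lower words_per_minute speaks slower."""
    argv = ["espeak", "-s", str(words_per_minute)]
    if voice is not None:
        # Optional voice name, e.g. an installed mbrola voice.
        argv += ["-v", voice]
    argv.append(text)
    return argv


def speak_with_espeak(text, words_per_minute=120, voice=None):
    """Run espeak and return its exit code (assumes espeak is installed)."""
    return subprocess.call(espeak_command(text, words_per_minute, voice))
```

For example, `speak_with_espeak("Hello world", 100, voice="mb-en1")` would speak at 100 words per minute with that voice, if it is available.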

Consider using the Festival utility `text2wave` to write the audio as a file, then play the file using `sox` with the `speed` and `pitch` effects. To slow the audio down you will need a `speed` value less than one, and you can compensate for the effect on pitch with a positive value in `pitch`.
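The pitch compensation can be worked out rather than guessed: sox's `pitch` effect takes cents (1200 per octave), and a `speed` factor s shifts the pitch by 1200 * log2(s) cents, so the correction is the negative of that. Below is a minimal Python sketch under these assumptions: `text2wave` and sox's `play` are on the PATH, and the helper names and file paths are placeholders.

```python
import math
import subprocess


def pitch_compensation_cents(speed):
    """Cents needed to undo the pitch shift caused by sox's `speed` effect."""
    return int(round(-1200.0 * math.log(speed, 2)))


def slow_playback_commands(text_path, wav_path, speed=0.8):
    """Return the text2wave and play argument lists for slowed playback."""
    synth = ["text2wave", text_path, "-o", wav_path]
    play = ["play", wav_path, "speed", str(speed),
            "pitch", str(pitch_compensation_cents(speed))]
    return synth, play


def speak_file_slowly(text_path, wav_path, speed=0.8):
    """Synthesize the text file to WAV, then play it slowed down."""
    synth, play = slow_playback_commands(text_path, wav_path, speed)
    subprocess.check_call(synth)
    subprocess.check_call(play)
```

With `speed=0.8` the compensation comes out at 386 cents. Note that sox's `tempo` effect, used in the comment under the first answer, changes the speed without shifting the pitch, so no compensation is needed there.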
