
I am creating a program to turn text into speech (TTS).

What I've done so far is to split a given word into syllables and then play each pre-recorded syllable.

For example:

INPUT: [TELEVISION]

OUTPUT: [TEL - E - VI - SION]

And then the program plays each sound in order:

First:   play TEL.wav
Second:  play E.wav
Third:   play VI.wav
Fourth:  play SION.wav

I am using wave and PyAudio to play each wav file:

import wave, pyaudio
CHUNK = 1024
wf = wave.open("sounds/%s.wav" % (ss), 'rb')
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(), output=True)
data = wf.readframes(CHUNK)
while data:
    stream.write(data)
    data = wf.readframes(CHUNK)
# ... then stream.stop_stream(), stream.close(), p.terminate()
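
Simplified, that snippet sits inside a small helper that I call once per syllable; split_into_syllables below is just a stand-in for my actual splitting code:

def play_syllable(ss):
    # the wave/PyAudio code above, playing "sounds/%s.wav" % (ss)
    ...

for ss in split_into_syllables("TELEVISION"):   # -> ["TEL", "E", "VI", "SION"]
    play_syllable(ss)   # a new wave file and a new stream are opened for every syllable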

Now the problem is that during playback there is an audible delay between the audio files, so the spoken word sounds unnatural.

Is it possible to mix these audio files in memory, without creating a new file, and play them with a 0.2s delay between each audio file?
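
Something like this is what I have in mind: read every syllable's frames once, join them in memory and push them through a single output stream. (This is only a sketch and assumes all my syllable files share the same sample rate, sample width and channel count.)

import wave
import pyaudio

syllables = ["TEL", "E", "VI", "SION"]   # output of my syllable splitter
frames = b""
for ss in syllables:
    wf = wave.open("sounds/%s.wav" % (ss), 'rb')
    width, channels, rate = wf.getsampwidth(), wf.getnchannels(), wf.getframerate()
    frames += wf.readframes(wf.getnframes())   # append the whole file; format assumed identical
    wf.close()

p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(width),
                channels=channels, rate=rate, output=True)
stream.write(frames)   # one continuous write, so there is no per-file gap
stream.stop_stream()
stream.close()
p.terminate()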

Edit: I tried Nullman's solution and it worked better than just opening a new wf for each sound. I also tried adding a crossfade following these instructions.
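
By crossfade I mean overlapping the tail of one syllable with the head of the next. A minimal sketch of that idea, assuming 16-bit mono PCM and using numpy (the ~20 ms overlap length is my own choice, not something from the instructions I followed):

import numpy as np

def crossfade(a, b, fade_samples):
    """Linearly fade the tail of a into the head of b (both 1-D int16 arrays)."""
    fade_out = np.linspace(1.0, 0.0, fade_samples)
    fade_in = np.linspace(0.0, 1.0, fade_samples)
    overlap = a[-fade_samples:] * fade_out + b[:fade_samples] * fade_in
    return np.concatenate([a[:-fade_samples],
                           overlap.astype(np.int16),
                           b[fade_samples:]])

# Usage with frames read via wf.readframes(...), e.g. a ~20 ms overlap at 44.1 kHz:
# a = np.frombuffer(first_syllable_frames, dtype=np.int16)
# b = np.frombuffer(second_syllable_frames, dtype=np.int16)
# stream.write(crossfade(a, b, fade_samples=int(0.02 * 44100)).tobytes())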

  • try [this](https://stackoverflow.com/a/52702015/7540911); it's supposed to join your files **in memory**, so you don't write the result to disk and there won't be a delay between them – Nullman Mar 17 '19 at 12:57
  • Possible duplicate of [How to join two wav files using python?](https://stackoverflow.com/questions/2890703/how-to-join-two-wav-files-using-python) – handras Mar 17 '19 at 13:56
