Convert text-to-speech output to a WAV file in Python

Question

I'm using pyttsx3 for text-to-speech tasks. Here's an example: https://github.com/padmalcom/AISpeechAssistant/blob/main/code/02_text_to_speech/simple_main_02.py Is there a way to save the spoken words directly to a WAV file?

Does this answer your question? [how can i convert a text file to mp3 file using python pyttsx3 and sapi5?](https://stackoverflow.com/questions/53935096/how-can-i-convert-a-text-file-to-mp3-file-using-python-pyttsx3-and-sapi5) — Nick ODell, May 16 '23 at 20:20

score 0 · Accepted Answer · answered May 16 '23 at 20:29

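I believe what you are asking for is `save_to_file`, which queues the text to be written to a file instead of played aloud. The file is only written once `runAndWait()` is called: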

import pyttsx3
import wave

# Initialize the pyttsx3 engine
engine = pyttsx3.init()

# Set properties for the speech output (optional)
engine.setProperty('rate', 150)  # Speed of speech
engine.setProperty('volume', 1.0)  # Volume (0.0 to 1.0)

# Set the output file name
output_file = 'output.wav'

# Convert text to speech
text = "Hello, this is an example of text-to-speech conversion."
engine.save_to_file(text, output_file)

# Run the speech synthesis
engine.runAndWait()

# Optional: Get the audio data in the form of a wave file object
with wave.open(output_file, 'rb') as wav_file:
    # You can now manipulate the wave file object as needed
    # For example, you can get information about the audio file:
    frames = wav_file.getnframes()
    channels = wav_file.getnchannels()
    sample_width = wav_file.getsampwidth()
    frame_rate = wav_file.getframerate()
    duration = frames / float(frame_rate)

    print("Audio information:")
    print(f"Number of frames: {frames}")
    print(f"Number of channels: {channels}")
    print(f"Sample width: {sample_width}")
    print(f"Frame rate: {frame_rate}")
    print(f"Duration: {duration} seconds")
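If you want to reuse the inspection step, it can be wrapped in a small helper. This is only a sketch: `describe_wav` is a name I made up, not part of pyttsx3 or the `wave` module. It returns `None` when the file cannot be parsed as WAV, which may be a handy sanity check, since I would not assume every platform's speech driver writes a true RIFF/WAV container just because the file name ends in `.wav`.

```python
import wave


def describe_wav(path):
    """Return basic properties of a WAV file, or None if it is not valid WAV.

    Hypothetical helper for illustration; not part of pyttsx3.
    """
    try:
        with wave.open(path, "rb") as wav_file:
            frames = wav_file.getnframes()
            frame_rate = wav_file.getframerate()
            return {
                "frames": frames,
                "channels": wav_file.getnchannels(),
                "sample_width": wav_file.getsampwidth(),  # bytes per sample
                "frame_rate": frame_rate,                 # samples per second
                "duration": frames / frame_rate,          # seconds
            }
    except (wave.Error, EOFError):
        # wave.Error: header is not RIFF/WAVE; EOFError: file is empty or truncated
        return None
```

After `engine.runAndWait()` you could call `describe_wav('output.wav')` and treat a `None` result as a sign that the output needs converting with another tool.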
