8

I want my chatbot to deliver its response as both audio and text.

All the example code I've found that uses gTTS seems to require saving the speech to a file and then playing that file.

Is there a simpler way to do this with gTTS, such as playing the chatbot's response directly?

June Wang
  • What examples did you look at? There are three examples in the docs, and only one of them requires `save`. They even call the section for the last one ["Playing sound directly"](http://gtts.readthedocs.io/en/latest/module.html#playing-sound-directly). – abarnert Jul 03 '18 at 23:40
  • well, you'll still have to type in 'hello' first. Is there a way to pass a variable and play it? – June Wang Jul 04 '18 at 01:48
  • `gTTS` doesn't know or care whether the string comes from a variable or a literal in your code, same as every other function in Python. Just like you can type `print('hello')` or `print(my_variable)`, you can type `gTTS('hello', 'en')` or `gTTS(my_variable, 'en')`. – abarnert Jul 04 '18 at 02:09
  • I c. Good to know that. Thanks. – June Wang Jul 04 '18 at 02:11

5 Answers

6

If you look even briefly at the docs, you'll see that, of the three examples, only one of them requires you to call save, and the third one is specifically called "Playing sound directly".

So, just do exactly what's in that example, but substitute your string in place of the literal 'hello':

>>> from gtts import gTTS
>>> from io import BytesIO
>>>
>>> my_variable = 'hello' # your real code gets this from the chatbot
>>> 
>>> mp3_fp = BytesIO()
>>> tts = gTTS(my_variable, lang='en')
>>> tts.write_to_fp(mp3_fp)
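
The same steps can be wrapped in a small helper so the chatbot's reply is passed in as an argument. This is only a sketch: the name `text_to_mp3_buffer` and the `tts_factory` parameter are made up for illustration (the latter just allows a stand-in for gTTS). It also rewinds the buffer, which most audio libraries will need before they can read it from the start.

```python
from io import BytesIO


def text_to_mp3_buffer(text, lang="en", tts_factory=None):
    """Return an in-memory MP3 of `text`, rewound to the start."""
    if tts_factory is None:
        # Imported lazily so the helper can be loaded without gTTS installed.
        from gtts import gTTS
        tts_factory = gTTS
    buffer = BytesIO()
    tts_factory(text, lang=lang).write_to_fp(buffer)
    buffer.seek(0)  # write_to_fp leaves the position at the end of the data
    return buffer
```

With gTTS installed, `mp3_fp = text_to_mp3_buffer(my_variable)` would then replace the three lines above.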

Note, however, that gTTS doesn't come with an MP3 player; you need a separate audio library to play the mp3_fp buffer:

>>> # Load `mp3_fp` as an mp3 file in
>>> # the audio library of your choice

As the docs say, there are many such libraries, and Stack Overflow is not a good place to get recommendations for libraries. I happen to have a library installed, named musicplayer, and a sample app that can be easily adapted here, but it's probably not the simplest one by a long shot (it's made for doing more powerful, low-level stuff):

>>> import musicplayer
>>> class Song:
...     def __init__(self, f):
...         self.f = f
...     def readPacket(self, size):
...         return self.f.read(size)
...     def seekRaw(self, offset, whence):
...         self.f.seek(offset, whence)
...         return self.f.tell()
>>> player = musicplayer.createPlayer()
>>> player.queue = [Song(mp3_fp)]
>>> player.playing = True
abarnert
3

If you want to call a speak function again and again without errors, save the speech to a file, play it, and delete the file afterwards.

Basically, this serves the purpose:

from gtts import gTTS
import os
import playsound

def speak(text):
    tts = gTTS(text=text, lang='en')

    filename = "abc.mp3"
    tts.save(filename)
    playsound.playsound(filename)
    os.remove(filename)
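
A possible variation, sketched under some assumptions: use a unique temporary file instead of the fixed name "abc.mp3", and delete it in a `finally` block so it is cleaned up even if playback fails. The `synthesize` and `play` parameters are hypothetical hooks, not part of gTTS or playsound; in practice they might be `lambda t, p: gTTS(text=t, lang='en').save(p)` and `playsound.playsound`.

```python
import os
import tempfile


def speak_via_temp_file(text, synthesize, play):
    """Write speech for `text` to a unique temp file, play it, then delete it.

    `synthesize(text, path)` and `play(path)` are caller-supplied callables
    (hypothetical hooks for this sketch).
    """
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # only the path is needed; the synthesizer reopens the file
    try:
        synthesize(text, path)
        play(path)
    finally:
        os.remove(path)  # clean up even if synthesis or playback raises
```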
2

One solution I found is to use pygame.mixer. Here, time is imported only so that time.sleep can keep the program alive until the audio has finished playing.

from gtts import gTTS
from io import BytesIO
from pygame import mixer
import time

def speak():
    mp3_fp = BytesIO()
    tts = gTTS('hello, Welcome to Python Text-to-Speech!', lang='en')
    tts.write_to_fp(mp3_fp)
    return mp3_fp

mixer.init()
sound = speak()
sound.seek(0)
mixer.music.load(sound, "mp3")
mixer.music.play()
time.sleep(5)
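
The fixed five-second sleep only fits clips of about that length. One alternative, shown here as a sketch, is to poll until playback stops. The helper name `wait_until_idle` and its parameters are invented for this example; it takes any zero-argument callable that reports whether audio is still playing, such as `mixer.music.get_busy`.

```python
import time


def wait_until_idle(is_busy, poll_interval=0.1, timeout=None):
    """Block until `is_busy()` returns False; return False if `timeout` expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while is_busy():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

In the listing above, `wait_until_idle(mixer.music.get_busy)` would take the place of `time.sleep(5)`.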
1

[Linux] Speech in Python

Installation

  1. [Terminal] Upgrade pip: pip install --upgrade pip
  2. [Terminal] Install Google Text to Speech: pip install gTTS
  3. [Terminal] Install pygame: pip install pygame
  4. [Coding IDE] Add speech.py: See listing below
  5. [Coding IDE] Call speak: See listing below

speech.py

from gtts import gTTS
from io import BytesIO
import pygame

class Speech():

    @classmethod
    def speak(cls, text):
        mp3_file_object = BytesIO()
        tts = gTTS(text, lang='en')
        tts.write_to_fp(mp3_file_object)
        mp3_file_object.seek(0)  # rewind so the mixer reads from the start
        pygame.init()
        pygame.mixer.init()
        pygame.mixer.music.load(mp3_file_object, 'mp3')
        pygame.mixer.music.play()

Example

from .speech import Speech
Speech.speak('hello world')

Warning

It's a female voice and sounds realistic. It sounds like there's a woman in the room, fwiw.

toddmo
  • What is the purpose of the class structure? Is that just a style thing? – Shep Bryan Feb 05 '23 at 03:44
  • @ShepBryan, to clarify your question, you would want the code to be just `speak` and `stop` and `goFaster`, not `Speech.speak`, `Speech.stop` or `Speech.goFaster`? – toddmo Feb 05 '23 at 16:07
-5

You can also use the playsound library.

>>> import playsound
>>> playsound.playsound('sound.mp3')

For more information on playsound, see the playsound docs.