Creating an array of words from a stream of audio from speech-recognition

Question

I'm currently using the Python library SpeechRecognition to get the phrases between pauses in audio received from my microphone.

However, what I need is to print each word as I speak continuously, and I don't know how to do that.

Eventually I want to analyze a set number of words to look for a key phrase. My plan is to use multithreading to analyze the recognized text at intervals.

Here's my current code:

import speech_recognition as sr

from threading import Thread

# obtain audio
def voiceRecognition():
    while True:
        audioText = ''
        r = sr.Recognizer()
        with sr.Microphone() as source:
            audio = r.listen(source)
            try:
                audioText = r.recognize_google(audio)
                print(audioText)
            except sr.UnknownValueError:
                pass


if __name__ == '__main__':
    Thread(target = voiceRecognition).start()

Possible duplicate of [String to list in Python](https://stackoverflow.com/questions/5453026/string-to-list-in-python) — Nikolay Shmyrev, Jan 25 '18 at 15:12

score 0 · Answer 1 · answered Mar 17 '18 at 23:29

Comparing my working code to yours, I would move the try block out of the with sr.Microphone() block, so that it sits at the same indentation level, as shown below:

    with sr.Microphone() as source:
        audio = r.listen(source)
    try:
        audioText = r.recognize_google(audio)
        print(audioText)
    except sr.UnknownValueError:
        pass  # speech was unintelligible; skip this phrase

Also, though perhaps outside the scope of the question, I use the TextBlob package (https://pypi.python.org/pypi/textblob), which is built on the NLTK platform (http://www.nltk.org/). You might find it useful for parsing the results.
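For the simpler goal stated in the question (a list of words plus a key-phrase check), plain string splitting may already be enough. The following is a minimal sketch, not code from either poster: the class name `KeyPhraseDetector`, its `feed` method, and the `history` parameter are hypothetical names chosen for illustration. It keeps a rolling window of recent words so that a phrase split across two recognized chunks is still found.

```python
from collections import deque


class KeyPhraseDetector:
    """Keeps the most recent words and reports when a key phrase appears."""

    def __init__(self, key_phrase, history=50):
        self.key_words = key_phrase.lower().split()
        if not self.key_words:
            raise ValueError("key_phrase must contain at least one word")
        # Never keep fewer words than the phrase itself contains.
        self.recent = deque(maxlen=max(history, len(self.key_words)))

    def feed(self, text):
        """Add a transcript's words one by one; return True if the phrase completed."""
        found = False
        for word in text.lower().split():
            self.recent.append(word)
            # Compare the last len(key_words) words against the phrase.
            tail = list(self.recent)[-len(self.key_words):]
            if tail == self.key_words:
                found = True
        return found
```

In the question's loop, one could create the detector once and call `detector.feed(audioText)` after each successful `recognize_google` call. The comparison is a plain lowercase word match, so punctuation or recognition variants would need extra normalization.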

Wow, this was so long ago that I completely forgot about it. This NLTK stuff seems pretty cool; I'll have to check it out. I did find a solution, but completely forgot to post it. — Ernest.V, Mar 19 '18 at 00:24

score 0 · Accepted Answer · edited Mar 19 '18 at 00:44

I used multithreading and limited the audio each thread could record to 5 seconds, so that each clip was short enough for Google to transcribe. As soon as one thread finished listening, a new thread could start listening while the first one sent its audio off for transcription, and so on.
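The answer does not include its code, so the following is only a sketch of one way to get a similar effect, not the poster's implementation. Instead of handing the microphone from thread to thread, it uses a producer/consumer variant: one loop records chunks of at most 5 seconds and a second thread transcribes them from a queue. The function names (`listen_loop`, `recognize_loop`, `run_with_microphone`) and the `ignored_errors` parameter are hypothetical. The SpeechRecognition calls used are `Recognizer.listen` with its `phrase_time_limit` argument, `Recognizer.adjust_for_ambient_noise`, and `Recognizer.recognize_google`.

```python
import queue
import threading


def listen_loop(listen_once, audio_queue, stop_event):
    """Producer: keep capturing short audio chunks and queue them."""
    while not stop_event.is_set():
        audio_queue.put(listen_once())
    audio_queue.put(None)  # sentinel: tells the consumer to finish


def recognize_loop(recognize, audio_queue, on_words, ignored_errors=()):
    """Consumer: transcribe each queued chunk and pass on its words."""
    while True:
        audio = audio_queue.get()
        if audio is None:
            break
        try:
            text = recognize(audio)
        except ignored_errors:
            continue  # e.g. unintelligible audio; skip this chunk
        on_words(text.split())


def run_with_microphone(on_words, chunk_seconds=5):
    """Wire both loops to SpeechRecognition (needs a microphone and network)."""
    import speech_recognition as sr  # lazy import keeps the helpers testable

    r = sr.Recognizer()
    audio_queue = queue.Queue()
    stop_event = threading.Event()
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        worker = threading.Thread(
            target=recognize_loop,
            args=(r.recognize_google, audio_queue, on_words),
            kwargs={"ignored_errors": (sr.UnknownValueError,)},
            daemon=True,
        )
        worker.start()
        # phrase_time_limit caps each recording at chunk_seconds,
        # so listening resumes while the worker is still transcribing.
        listen_loop(
            lambda: r.listen(source, phrase_time_limit=chunk_seconds),
            audio_queue,
            stop_event,
        )
```

Calling `run_with_microphone(print)` would print each chunk's word list as it is transcribed, until interrupted with Ctrl+C. Because a chunk can be cut off mid-word at the time limit, words near chunk boundaries may be misrecognized.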

answered Mar 19 '18 at 00:26 by Ernest.V · edited Mar 19 '18 at 00:44 by Stephen Rauch
