I am using pyaudio and pocketsphinx to listen to the microphone on my computer and translate what I am saying into text. I would like to know whether the program can keep listening as it does now and, after it hears a sentence, process the recording further by shortening every part of the temporary wav file that falls below a certain threshold by, say, 75%. For example, you speak a sentence and the program waits for you to finish talking; once it detects a long pause, it stops listening and passes the wav data to a function that shrinks the gaps between words by 75%, then hands the shortened wav file to the pocketsphinx library for speech recognition. I have heard of other solutions using numpy and scipy, but they required the user to mark the segments to trim manually with a mouse in the wav spectrogram. I want to handle this automatically within the code. Any help would be greatly appreciated!

1 Answer

The numpy and scipy solutions do not require any user interaction, as long as you leave the GUI out.

>>> import numpy
>>> from scipy.io.wavfile import read
>>> a = read("adios.wav")  # returns (sample_rate, data)
>>> numpy.array(a[1], dtype=float)
array([ 128.,  128.,  128., ...,  128.,  128.,  128.])
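The constant 128 values above are what silence looks like in an 8-bit WAV file, which stores unsigned samples centered on 128. Before comparing anything against a threshold it usually helps to convert the samples to floats centered on zero. A minimal sketch, assuming integer PCM data as returned by scipy.io.wavfile.read; the helper name `to_float` is made up for illustration:

```python
import numpy as np

def to_float(data):
    """Scale PCM samples to floats in roughly [-1, 1], with silence at 0."""
    data = np.asarray(data)
    if data.dtype == np.uint8:
        # 8-bit WAV is unsigned, with silence at 128
        return (data.astype(np.float64) - 128.0) / 128.0
    if np.issubdtype(data.dtype, np.integer):
        # 16- and 32-bit WAV are signed, with silence at 0
        full_scale = float(np.iinfo(data.dtype).max) + 1.0
        return data.astype(np.float64) / full_scale
    # Already floating point: just make sure it is float64
    return data.astype(np.float64)
```

With the file above this would be called as `to_float(a[1])`.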

scipy.signal has many built-in functions for this kind of operation.
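For the specific case in the question, plain numpy may already be enough. The following is only a sketch of one possible approach, not a tested recipe for pocketsphinx: it splits the signal into short frames, treats frames whose RMS level is below a threshold as silence, and keeps only the first 25% of each silent stretch. It assumes mono samples already converted to floats centered on zero; the function name `shrink_silence` and the default values for `threshold` and `frame_ms` are assumptions that would need tuning for a real microphone.

```python
import numpy as np

def shrink_silence(samples, rate, threshold=0.02, frame_ms=20, keep=0.25):
    """Return a copy of `samples` with each quiet stretch cut to `keep` of its length.

    samples   -- 1-D float array centered on zero (mono audio)
    rate      -- sample rate in Hz
    threshold -- RMS level below which a frame counts as silence (assumed value)
    frame_ms  -- frame length in milliseconds (assumed value)
    keep      -- fraction of each quiet stretch to keep (0.25 removes 75%)
    """
    samples = np.asarray(samples, dtype=np.float64)
    frame_len = max(1, int(rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return samples.copy()

    # RMS level of every full frame
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    quiet = np.sqrt((frames ** 2).mean(axis=1)) < threshold

    # Frame indices where a run of quiet (or loud) frames begins and ends
    edges = np.flatnonzero(np.diff(quiet.astype(np.int8))) + 1
    starts = np.concatenate(([0], edges))
    stops = np.concatenate((edges, [n_frames]))

    pieces = []
    for start, stop in zip(starts, stops):
        run = samples[start * frame_len:stop * frame_len]
        if quiet[start]:
            # Keep only the first part of the gap, drop the rest
            run = run[:int(len(run) * keep)]
        pieces.append(run)
    # Samples left over after the last full frame are kept unchanged
    pieces.append(samples[n_frames * frame_len:])
    return np.concatenate(pieces)
```

The shortened array can then be scaled back to 16-bit integers and saved with scipy.io.wavfile.write before it is handed to the recognizer.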

There are already other posts on this topic:

Python: write a wav file into numpy float array

How to manipulate wav file data in Python?

What is the easiest way to read wav-files using Python [summary]?

Joe