I am trying to run the following code on a Raspberry Pi 3 Model B, which has limited memory. The problem is that it only runs some of the time:

from os import environ, path
import pyaudio
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

MODELDIR = "../../../model"
DATADIR = "../../../test/data"

config = Decoder.default_config()
config.set_string('-hmm', path.join(MODELDIR, 'en-us/en-us'))
config.set_string('-lm', path.join(MODELDIR, '3199.lm'))
config.set_string('-dict', path.join(MODELDIR, '3199.dic'))
config.set_string('-logfn', '/dev/null')
decoder = Decoder(config)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

in_speech_bf = False
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()
                result = decoder.hyp().hypstr

                print 'Result:', result
                if result == 'yes':
                    print 'Do whatever you want'

                decoder.start_utt()
    else:
        break
decoder.end_utt()

The program keeps crashing and throws the following exception: OSError: [-9985] Errno Device unavailable

0x01Brain
  • The code you linked does not read from the microphone. You need to provide the code you are actually using and the complete log output, not just the error. Large-vocabulary continuous speech recognition with a language model is not possible with pocketsphinx on a Raspberry Pi; it is too slow. – Nikolay Shmyrev Sep 03 '16 at 06:48
  • Yes sorry, I have updated the question post. – 0x01Brain Sep 03 '16 at 18:14
  • You can try recording at 44.1 kHz if that is the only rate your audio configuration supports. In that case, add the options `-samprate 44100 -nfft 2048` to the decoder config. Alternatively, configure pulseaudio or alsa-dmix on the system to do the resampling for you. – Nikolay Shmyrev Sep 04 '16 at 09:12
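
The 44.1 kHz suggestion in the last comment could be sketched roughly as follows. `apply_sample_rate` is a hypothetical helper, not part of pocketsphinx. It assumes the config object offers `set_float` and `set_int` next to the `set_string` calls used in the question, and that `-samprate` takes a float while `-nfft` takes an integer. The rate passed to `p.open()` has to match.

```python
def apply_sample_rate(config, rate=44100, nfft=2048):
    """Set the decoder's sample rate and FFT size; return the rate for p.open()."""
    # Assumption: -samprate is a float option and -nfft an integer option
    config.set_float('-samprate', float(rate))
    config.set_int('-nfft', int(nfft))
    return rate

# Usage with the question's objects (not executed here):
#   rate = apply_sample_rate(config)      # call before Decoder(config)
#   decoder = Decoder(config)
#   stream = p.open(format=pyaudio.paInt16, channels=1, rate=rate,
#                   input=True, frames_per_buffer=1024)
```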

1 Answer

First, try stopping the stream while each buffer is being decoded and starting it again before the next read.

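p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
# stream.start_stream()  # not needed: p.open() starts the stream by default
in_speech_bf = False
decoder.start_utt()
while True:
    # Restart capture if the previous iteration stopped it
    if stream.is_stopped():
        stream.start_stream()
    buf = stream.read(1024)
    if buf:
        # Pause capture while the decoder works on this buffer
        stream.stop_stream()
        decoder.process_raw(buf, False, False)
        # ... the rest of the loop is the same as in the question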
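
For reference, here is one way the whole loop might look with the start/stop change applied. This is only a sketch: `listen` and `on_result` are names made up for illustration, and `stream` and `decoder` are the objects created above, passed in as arguments. The check for `None` is an addition, since `decoder.hyp()` returns nothing when no phrase was recognized.

```python
def listen(stream, decoder, on_result, chunk=1024):
    """Feed microphone audio to the decoder, pausing capture while decoding."""
    in_speech = False
    decoder.start_utt()
    while True:
        # Restart capture if the previous iteration stopped it
        if stream.is_stopped():
            stream.start_stream()
        buf = stream.read(chunk)
        if not buf:
            break
        # Pause capture while the decoder works on this buffer
        stream.stop_stream()
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech:
            in_speech = decoder.get_in_speech()
            if not in_speech:
                # Speech just ended: close the utterance and read the result
                decoder.end_utt()
                hyp = decoder.hyp()
                if hyp is not None:  # None means nothing was recognized
                    on_result(hyp.hypstr)
                decoder.start_utt()
    decoder.end_utt()
```

It would be called as `listen(stream, decoder, handle)`, where `handle` is any function that takes the recognized string.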

If the problem persists, try defining an ALSA plug device in ~/.asoundrc:

pcm.record {
    type plug;
    slave {
        pcm "hw:<CARD>,<DEVICE>"
    }
}

Find the capture device (the sound card used for recording) and note its card and device numbers. In the example below both are 0. Substitute them for <CARD> and <DEVICE> in the definition above.

> arecord -l

**** List of CAPTURE Hardware Devices ****
card 0: Devices [USB Device 2345:3x55], device 0: USB Audio [USB Audio]

The plug definition now looks like this:

pcm.record {
    type plug;
    slave {
        pcm "hw:0,0"
    }
}
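
If you would rather not copy the numbers by hand, the lookup can be scripted. The sketch below parses text in the format shown above and fills in the plug definition. `capture_devices` and `plug_definition` are hypothetical helper names, and the pattern assumes English `arecord -l` output.

```python
import re

def capture_devices(arecord_output):
    """Return (card, device) number pairs found in `arecord -l` output."""
    pattern = re.compile(r'^card (\d+):.*?, device (\d+):', re.MULTILINE)
    return [(int(c), int(d)) for c, d in pattern.findall(arecord_output)]

def plug_definition(card, device, name='record'):
    """Build the ~/.asoundrc plug entry for one capture device."""
    return ('pcm.%s {\n'
            '    type plug;\n'
            '    slave {\n'
            '        pcm "hw:%d,%d"\n'
            '    }\n'
            '}\n') % (name, card, device)

# The sample output from above; on the Pi this text would come from arecord -l
sample = """**** List of CAPTURE Hardware Devices ****
card 0: Devices [USB Device 2345:3x55], device 0: USB Audio [USB Audio]
"""
card, device = capture_devices(sample)[0]
print(plug_definition(card, device))
```

On the Pi itself the text could be obtained with `subprocess.check_output(['arecord', '-l'], universal_newlines=True)`.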

Save the ~/.asoundrc file and reboot the RPi. Then find the index of the newly created device (pcm.record) with the following Python script.

import pyaudio
po = pyaudio.PyAudio()
for index in range(po.get_device_count()): 
    desc = po.get_device_info_by_index(index)
    if desc["name"] == "record":
        print "DEVICE: %s  INDEX:  %s  RATE:  %s " %  (desc["name"], index,  int(desc["defaultSampleRate"]))

It prints the device index (9 here; yours may differ):

DEVICE: record  INDEX:  9  RATE:  48000 

Now make a small change to your main script: add input_device_index=9 to the p.open() call.

stream = p.open(format=pyaudio.paInt16, 
                channels=1, 
                rate=16000, 
                input=True, 
                input_device_index=9,
                frames_per_buffer=1024)
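
Since the index can change between systems (and sometimes between reboots), you may prefer to look it up at startup instead of hard-coding 9. A minimal sketch, assuming the device is still named "record": `find_input_device` is a hypothetical helper, and the extra `maxInputChannels` check is an addition that skips output-only devices.

```python
def find_input_device(pa, name='record'):
    """Return the index of the input device called `name`, or None."""
    for index in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(index)
        # Skip output-only devices, which cannot be opened with input=True
        if info['name'] == name and info['maxInputChannels'] > 0:
            return index
    return None

# Usage with the question's objects (not executed here):
#   p = pyaudio.PyAudio()
#   index = find_input_device(p)
#   if index is None:
#       raise RuntimeError('no input device named "record"')
#   stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000,
#                   input=True, input_device_index=index,
#                   frames_per_buffer=1024)
```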

That's all. Run your script again and see whether the issue is resolved.

g10dras
  • Thanks, it works now. What I actually did was terminate the audio engine and end the utterance once it recognizes the user's speech. – 0x01Brain Sep 26 '16 at 14:27
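
The fix described in the last comment was not posted as code. One possible reading of it, as a sketch only: listen until a single utterance has ended, then end the utterance and release the audio device even if an error occurs. `recognize_once` is a hypothetical name, and `pa`, `stream` and `decoder` are the `p`, `stream` and `decoder` objects from the question.

```python
def recognize_once(pa, stream, decoder, chunk=1024):
    """Return the first recognized phrase, then release the audio device."""
    heard_speech = False
    decoder.start_utt()
    try:
        while True:
            buf = stream.read(chunk)
            if not buf:
                break
            decoder.process_raw(buf, False, False)
            if decoder.get_in_speech():
                heard_speech = True
            elif heard_speech:
                # Speech has ended, so stop listening
                break
    finally:
        # Always end the utterance and free the device for the next run
        decoder.end_utt()
        stream.stop_stream()
        stream.close()
        pa.terminate()
    hyp = decoder.hyp()
    return hyp.hypstr if hyp is not None else None
```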