I have been looking into speech recognition and how it can be built into a website. I have found many examples using Python, and even one using Node.js, but I want to be able to do this with PHP.

Is there any way I can access CMUSphinx on a Linux server using PHP to process my inputs?

Thanks

JustSteveKing

2 Answers

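It can be done, but you would use Asterisk as the audio capture and processing engine. See http://www.voip-info.org/wiki/view/Sphinx

Example code is below, for use once your server has been configured. It is written as a method of an Asterisk AGI class, which is where $this->request, stream_file() and record_file() come from: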

    // Method of an Asterisk AGI class: $this->request, stream_file() and
    // record_file() are provided by that class, not by this function.
    function sphinx($filename = '', $timeout = 3000, $service_port = 1069, $address = '127.0.0.1') {

        /* if a recording has not been passed in we create one */
        if ($filename == '') {
            $filename = "/var/lib/asterisk/sounds/sphinx_" . $this->request['agi_uniqueid'];
            $extension = "wav";
            $this->stream_file('beep', 3000, 5);
            $this->record_file($filename, $extension, '0', $timeout);
            $filename = $filename . '.' . $extension;
        }

        /* Read the recording first, so a missing file fails before we connect. */
        $data = file_get_contents($filename);
        if ($data === false) {
            return false;
        }

        /* Create a TCP/IP socket. The socket_* functions return false on failure. */
        $socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
        if ($socket === false) {
            return false;
        }

        if (socket_connect($socket, $address, $service_port) === false) {
            socket_close($socket);
            return false;
        }

        /* Send the byte count on its own line, then the raw audio. */
        socket_write($socket, strlen($data) . "\n");
        socket_write($socket, $data);

        $response = socket_read($socket, 2048);

        socket_close($socket);

        unlink($filename);
        return $response;
    }

Another thought, after looking at the website: Sphinx4 allows web service access to the recognition processing daemon. That is, you run Sphinx as a daemon (it's Java!) and then open a socket, as above, to feed a .wav file into it directly. This is basically a modification of the code above: instead of calling the Asterisk server to record and then retrieve the audio, you would use something else, perhaps HTML5, to record it.
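A minimal sketch of that modification, with the Asterisk recording step removed so that any existing .wav file can be sent. It assumes the daemon speaks the same framing as the function above (byte count, newline, raw audio, then a text reply); the function names, the default host and port, and the 2048-byte reply limit are illustrative, not part of Sphinx.

```php
<?php
/**
 * Send WAV bytes over an already-open stream and return the daemon's reply.
 * Assumed protocol (mirrors the AGI function above): "<byte count>\n<raw audio>".
 */
function recognizeOverStream($stream, string $wavBytes, int $maxReply = 2048)
{
    $payload = strlen($wavBytes) . "\n" . $wavBytes;

    // fwrite() may write fewer bytes than requested, so loop until all are sent.
    for ($sent = 0; $sent < strlen($payload); $sent += $n) {
        $n = fwrite($stream, substr($payload, $sent));
        if ($n === false || $n === 0) {
            return false;
        }
    }

    $reply = fread($stream, $maxReply);
    return $reply === false ? false : trim($reply);
}

/**
 * Connect to a recognizer daemon and decode one file.
 * Host and port are placeholders; use whatever your daemon listens on.
 */
function recognizeFile(string $path, string $host = '127.0.0.1', int $port = 1069)
{
    if (!is_readable($path)) {
        return false;
    }
    $wavBytes = file_get_contents($path);
    if ($wavBytes === false) {
        return false;
    }

    $stream = @stream_socket_client("tcp://$host:$port", $errno, $errstr, 5);
    if ($stream === false) {
        return false;
    }

    $reply = recognizeOverStream($stream, $wavBytes);
    fclose($stream);
    return $reply;
}
```

Calling recognizeFile('/tmp/upload.wav') from the script that receives the browser upload would then return the recognized text, or false on any failure. Splitting out recognizeOverStream() keeps the protocol testable without a running daemon.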

Another thing to consider is that Chrome and HTML5 have built-in speech recognition.

Dave
  • It took me a while to get Sphinx4 installed, but I finally managed it! You said I could run it as a daemon: is there a default Java app for this? I have gone through countless tutorials just to get it installed, and I know how to execute one of the example .jar files, but setting the whole thing up as a daemon is beyond what I have read so far. Any advice? So, to clarify: I can run Sphinx as a daemon, then use sockets to send an audio file back and forth to be processed. – JustSteveKing Sep 05 '14 at 15:44
  • Just google "run java app as daemon on linux"; there are plenty of examples using upstart, systemd, etc. to start a Java app at boot. – Dave Sep 08 '14 at 08:45
  • Thanks @Dave seems like the most logical way forward with this. Looking at getting the sphinx-asterisk integration working so I can test this out. One thing is for sure - this isn't for the faint hearted! – JustSteveKing Sep 08 '14 at 10:25
  • Indeed, it seems massively overcomplicated. It's something that perhaps should be written as a PHP module/extension, even a PECL extension or something, since voice recognition is becoming more and more popular. I'd have thought there would be more support for it, since it's in the HTML5 spec, perhaps even via asm.js or something. – Dave Sep 08 '14 at 13:33

The architecture of such a system depends on the kind of audio you want to process. If the audio is long, you can just store it in a temporary file and invoke pocketsphinx_continuous as an external tool to process it:

http://php.net/manual/en/function.shell-exec.php

You call pocketsphinx_continuous -infile file.wav > decode-result.txt and that gives you a result to display. The problem with this approach is that decoder initialization takes time on every call, so it is not suitable for short files.
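As a rough sketch, that call could be wrapped in PHP like this. The -infile flag is the one quoted above; the helper names are made up for illustration, and discarding stderr is an assumption that the decoder's log output goes there while the recognized text goes to stdout.

```php
<?php
/** Build the shell command, quoting the path so spaces and metacharacters are safe. */
function buildDecodeCommand(string $wavPath): string
{
    // Assumption: log noise goes to stderr, so it is dropped here.
    return 'pocketsphinx_continuous -infile ' . escapeshellarg($wavPath) . ' 2>/dev/null';
}

/** Decode one stored audio file; returns the decoder's output, or null on failure. */
function decodeLongAudio(string $wavPath): ?string
{
    if (!is_readable($wavPath)) {
        return null;
    }

    // shell_exec() returns the command's stdout, or null/false if nothing was produced.
    $output = shell_exec(buildDecodeCommand($wavPath));

    return is_string($output) ? trim($output) : null;
}
```

Capturing stdout directly avoids the intermediate decode-result.txt file. escapeshellarg() matters here because the file name usually originates from an upload.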

If you want to process short samples, or process audio in streaming mode, you need some kind of server that loads the models once and then waits for requests. There are several ways to implement it, from a simple hand-made server listening on a TCP port and accepting data over a simple protocol, to more complex solutions like http://unimrcp.org
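To make the "simple hand-made server" idea concrete, here is a hedged sketch of the request loop in PHP. It assumes a length-prefixed protocol (decimal byte count, newline, raw audio) and takes the actual decoder as a callable, because the model loading would have to live in whatever long-running process wraps the recognizer; both function names and $decode are hypothetical.

```php
<?php
/** Read one request: a decimal byte count, a newline, then exactly that many bytes. */
function readAudioRequest($conn): ?string
{
    $header = fgets($conn, 32);
    if ($header === false || !ctype_digit(trim($header))) {
        return null;
    }

    $remaining = (int) trim($header);
    $audio = '';

    // fread() can return short reads on sockets, so loop until the body is complete.
    while ($remaining > 0 && !feof($conn)) {
        $chunk = fread($conn, min($remaining, 8192));
        if ($chunk === false || $chunk === '') {
            break;
        }
        $audio .= $chunk;
        $remaining -= strlen($chunk);
    }

    return $remaining === 0 ? $audio : null;
}

/** Accept connections forever, answering each request with whatever $decode returns. */
function serve(string $address, callable $decode): void
{
    $server = stream_socket_server($address, $errno, $errstr);
    if ($server === false) {
        throw new RuntimeException("Cannot listen on $address: $errstr");
    }

    while (true) {
        $conn = @stream_socket_accept($server);
        if ($conn === false) {
            continue; // accept timed out; keep waiting
        }
        $audio = readAudioRequest($conn);
        fwrite($conn, ($audio === null ? 'ERROR' : $decode($audio)) . "\n");
        fclose($conn);
    }
}
```

It would be started once from the command line, for example with serve('tcp://127.0.0.1:1069', $decode), where $decode hands the bytes to an already-initialized recognizer. This loop handles one client at a time, which is the simplest form that still avoids re-initializing the decoder for every request.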

Nikolay Shmyrev