
I'm developing a website, and I would like to help blind people use it by voice, so I will use:

  • Text-to-speech, to present some possibilities to the user
  • Speech-to-text, to allow the user to select one with her voice

I already have some text-to-speech JavaScript libraries (like speak.js), but now I need a good speech-to-text one. There are some solutions for this purpose (like speechapi), but they use Java Applets or Flash, and I want to depend only on JavaScript, to avoid plugins.

I'm trying HTML5's speech input with x-webkit-speech in Google Chrome, and it works well, but you need to click on an icon (and blind people can't use a mouse well). Is it possible to trigger x-webkit-speech by pressing a key? Do you know of any alternative (JavaScript) API?

Thank you!

sgmonda
  • Shouldn't speech navigation be part of the browser, instead of the website? – Bergi Jun 28 '12 at 09:51
  • @Bergi The navigation could be a browser responsibility, but my website is more complex. It has simple games to improve your brain capabilities, and my intention is to adapt some of them for blind people, so the browser couldn't manage my HTML5 games. – sgmonda Jun 28 '12 at 09:58
  • @Bergi is right. Most users who need this sort of thing will be using a text-to-speech system already, not a normal browser. Your idea sounds good but it will get in the way of their existing solution. Conforming to the established accessibility standards is very important to this demographic, so be very wary of doing anything that could be seen as re-inventing the wheel or going against the existing standards. – SDC Jun 28 '12 at 10:39
  • @SDC my intention is to adapt my site navigation to standards of this type (I have to investigate a bit), but I still need to do what I want for my games. – sgmonda Jun 28 '12 at 10:43
  • @sgmonda: Do you actually know any blind computer users? Remember, just because someone is blind, that doesn't mean they can't use a keyboard. In fact, to the best of my knowledge, blind people often prefer a keyboard interface, because that allows them to get immediate feedback (via text-to-speech) on every letter they type. With voice recognition, they'd have to listen to their sentence read back to ensure it came out right. – Daniel Pryden Jun 29 '12 at 03:50
  • I plan to offer keyboard interaction too, but speech could be very interesting for some games I have in mind. – sgmonda Jun 29 '12 at 09:24
  • If you are looking for a portable speech recognition solution for HTML5 browsers that support the audio input API, check out the JavaScript port of Pocketsphinx: https://github.com/syl22-00/pocketsphinx.js. It's based on CMUSphinx, but runs in the browser on the client side. – Nikolay Shmyrev Jul 02 '13 at 17:51

2 Answers

Is it possible to use x-webkit-speech pressing a key?

According to this post and this post, you cannot override how speech input starts: the user has to click the microphone.

x-webkit-speech uses HTML5's audio capture capabilities and sends the audio to Google's servers for processing, which return the results as JSON. This blogger has reverse-engineered it. You could develop a JavaScript library that listens for a key press to start capturing audio in HTML5-enabled browsers and sends it to Google's service or to one you have created. The downside of using Google's service is that it is an unsupported API and subject to change at any time. The downside of developing your own service is that it can be expensive to develop and maintain.
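
The key-press approach described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: it assumes a browser that supports `navigator.mediaDevices.getUserMedia` and `MediaRecorder`, and the names `createPushToTalk` and `RECOGNIZER_URL`, the `/recognize` path, and the choice of the `F2` key are all hypothetical. The endpoint stands in for whichever recognition service you send the audio to.

```javascript
// Hypothetical endpoint: replace with the recognition service you actually use.
const RECOGNIZER_URL = '/recognize';

// Returns key handlers that call start() on key down and stop() on key up.
// start/stop are injected so the key logic does not depend on browser APIs.
function createPushToTalk({ key, start, stop }) {
  let recording = false;
  return {
    onKeyDown(event) {
      // Ignore other keys and auto-repeat events while the key is held.
      if (event.key !== key || recording) return;
      recording = true;
      start();
    },
    onKeyUp(event) {
      if (event.key !== key || !recording) return;
      recording = false;
      stop();
    },
    isRecording: () => recording,
  };
}

// Browser wiring: runs only where audio capture is available.
if (typeof window !== 'undefined' && typeof MediaRecorder !== 'undefined' &&
    navigator.mediaDevices) {
  let recorder = null;
  const talk = createPushToTalk({
    key: 'F2', // assumed key; pick one that does not clash with screen readers
    start: async () => {
      try {
        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
        if (!talk.isRecording()) {
          // The key was released before microphone access was granted.
          stream.getTracks().forEach((track) => track.stop());
          return;
        }
        const chunks = [];
        const rec = new MediaRecorder(stream);
        rec.ondataavailable = (e) => chunks.push(e.data);
        rec.onstop = async () => {
          stream.getTracks().forEach((track) => track.stop());
          const audio = new Blob(chunks, { type: rec.mimeType });
          const response = await fetch(RECOGNIZER_URL, { method: 'POST', body: audio });
          console.log(await response.text()); // response format depends on the service
        };
        recorder = rec;
        rec.start();
      } catch (err) {
        console.error('Could not capture or send audio:', err);
      }
    },
    stop: () => {
      if (recorder && recorder.state === 'recording') recorder.stop();
    },
  });
  document.addEventListener('keydown', talk.onKeyDown);
  document.addEventListener('keyup', talk.onKeyUp);
}
```

Keeping the key handling separate from the capture code means the key logic can be exercised without a microphone, and the service behind `RECOGNIZER_URL` can be swapped without touching it.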

Do you know any alternative API (JavaScript)?

This post and this post list some services available for speech recognition. I did not see Nuance listed; you may be able to use the Dragon Mobile SDK for this. You may also want to look into iSpeech.

Kevin Junghans

Google Translate is a very good text-to-speech engine. I have used it to read text aloud. For example, for the text "Welcome to Stack Overflow", you can call it like this:

http://translate.google.com/translate_tts?ie=UTF-8&q=Welcome%20to%20stack%20overflow&tl=en&total=1&idx=0&textlen=23&prev=input

Then use the browser's audio support to play it.
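
A rough sketch of building that URL and playing it might look like the following. The function names `buildTtsUrl` and `speak` are hypothetical, and treating `textlen` as the length of the text is an assumption about what that parameter means. The endpoint is not a documented API, so it may change or refuse requests.

```javascript
// Builds a URL for the Google Translate speech endpoint quoted above.
// Assumption: `textlen` is the length of the text being spoken.
function buildTtsUrl(text, lang) {
  const params = [
    ['ie', 'UTF-8'],
    ['q', text],
    ['tl', lang],
    ['total', '1'],
    ['idx', '0'],
    ['textlen', String(text.length)],
    ['prev', 'input'],
  ];
  const query = params
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join('&');
  return `http://translate.google.com/translate_tts?${query}`;
}

// Plays the spoken text with the browser's Audio element (browser only).
function speak(text, lang = 'en') {
  const audio = new Audio(buildTtsUrl(text, lang));
  return audio.play();
}
```

In a page, `speak('Welcome to stack overflow')` would request the audio and start playback.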

For speech input, you can manually activate the listening process; see http://code.google.com/chrome/extensions/experimental.speechInput.html

James
  • But then the user would need to install a Chrome extension. It is a possibility, but I would rather not depend on an extension. Ideally, the user would only need to open the browser (for now, Chrome is the only one that supports speech input, but this is in the HTML5 specification, so all browsers will support it in time) – sgmonda Jun 28 '12 at 09:52