1

Currently, I can click any of the 4 buttons on an HTML page using JavaScript running in Tampermonkey, by selecting the ID of the button to click. However, I want to use speech recognition to click any of the 4 buttons by speaking one of the following words: NONE, ONE, TWO, THREE. I am guessing that the speech script will convert the word I speak to text, which will be looked up in a JavaScript array and matched to the ID of the button to click. How can I achieve this using JavaScript? Thanks.

  document.getElementById('radio0').click();

    <div class="radio-container">
      <div class="col-6">
        <button id="radio0">None</button>
      </div>
      <div class="col-6">
        <button id="radio1">One</button>
      </div>
      <div class="col-6">
        <button id="radio2">Two</button>
      </div>
      <div class="col-6">
        <button id="radio3">Three +</button>
      </div>
    </div>
Joseph

2 Answers

1

Come up with an array of button names. Because SpeechRecognition transcribes spoken numbers as digits (e.g. 1, not one), use the numeric values rather than their word representations.

var buttonNames = [ 'None', '1', '2', '3'];

I had trouble giving an embedded Stack Snippet permission to access the microphone (probably because of cross-domain and sandboxing rules), so I put all the code in a userscript. It replaces the page's HTML with your HTML. Click on the document body and recognition will start (open your browser's console to see what it's doing). Then, speak one of the button names. Make sure Stack Overflow, or whatever domain you run the userscript on, has permission to use your microphone.

When the onresult handler fires (after you stop speaking), take the transcript of the most recent result and see if it matches any of the buttonNames. If so, querySelectorAll the buttons in the document and .click() the button at the matching index.

// ==UserScript==
// @name         Userscript Speech Recognition
// @namespace    CertainPerformance
// @version      1
// @match        https://stackoverflow.com/questions/51702275/click-button-using-javascript-speech-recognition-tampermonkey
// @grant        none
// ==/UserScript==

document.head.innerHTML = '';
document.body.innerHTML = `
    <div class="radio-container" style="height:1000px">
         <div class="col-6">
          <button id="radio0">None</button>
         </div>
        <div class="col-6">
         <button id="radio1">One</button>
        </div>
        <div class="col-6">
         <button id="radio2">Two</button>
        </div>
        <div class="col-6">
         <button id="radio3">Three +</button>
        </div>
      </div>
`;

document.addEventListener('click', ({ target }) => {
  if (!target.matches('button')) return;
  console.log('Click detected: ' + target.outerHTML);
});
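// Chrome exposes these constructors with a webkit prefix; reading them off window
// avoids a ReferenceError when a name is not defined in the browser
var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
var SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;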


var buttonNames = [ 'None', '1', '2', '3'];

var recognition = new SpeechRecognition();

document.body.onclick = function(e) {
  if (e.target.matches('button')) return;
  recognition.start();
  console.log('Listening');
}

recognition.onresult = function(event) {
  var last = event.results.length - 1;
  // Normalize the transcript: it may carry extra whitespace or different capitalization
  var speechText = event.results[last][0].transcript.trim().toLowerCase();
  console.log('Heard ' + speechText);
  const foundButtonIndex = buttonNames.findIndex(buttonName => buttonName.toLowerCase() === speechText);
  console.log(foundButtonIndex);
  if (foundButtonIndex !== -1) document.querySelectorAll('button')[foundButtonIndex].click();
}

recognition.onspeechend = function() {
  recognition.stop();
}

recognition.onnomatch = function(event) {
  console.log('Not recognized')
}

recognition.onerror = function(event) {
  console.log('Error ' + event.error);
}

For a more generic solution when the buttons can have any text inside them, and you want to be able to speak the button text and have the appropriate button clicked, you might querySelectorAll all buttons on pageload, map them to an object with keys corresponding to their text content, and then click buttonObj[speechText] if it exists.
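
A minimal sketch of that generic approach, assuming each button's normalized label is unique. The helper names (normalizeLabel, buildButtonMap, clickSpokenButton) and the aliases table are hypothetical, made up for illustration; the aliases are there because the recognizer tends to return digits such as "3" rather than "three".

```javascript
// Hypothetical helpers illustrating the generic approach; the names are illustrative only.

// Lowercase the text and collapse punctuation so "Three +" and " three " compare equal.
function normalizeLabel(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Assumed alias table: spoken numbers usually arrive as digits.
const aliases = new Map([['1', 'one'], ['2', 'two'], ['3', 'three']]);

// Map each button's normalized text content to the button itself.
function buildButtonMap(buttons) {
  const buttonMap = new Map();
  for (const button of buttons) {
    buttonMap.set(normalizeLabel(button.textContent), button);
  }
  return buttonMap;
}

// Click the button whose label matches the transcript; return whether one was found.
function clickSpokenButton(buttonMap, transcript) {
  const spoken = normalizeLabel(transcript);
  const button = buttonMap.get(aliases.get(spoken) || spoken);
  if (!button) return false;
  button.click();
  return true;
}
```

In the userscript, you might build the map once with buildButtonMap(document.querySelectorAll('button')) and call clickSpokenButton(buttonMap, speechText) inside the onresult handler.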

CertainPerformance
Man, you seem to have nailed the whole issue, thanks. I will need time to go through your code and will get back to you in a few hours. Thanks a lot – Joseph Aug 06 '18 at 08:50
0

You could select the div by comparing its innerHTML with the text you get from speech recognition. To match the element, you could use the answers to this question: Javascript .querySelector find <div> by innerTEXT
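
As a rough sketch of that lookup: the function name findElementByText is hypothetical, and it compares textContent rather than innerHTML (a deliberate swap, so that nested markup does not affect the match). It matches the full text only, so "three" would not find a button labelled "Three +".

```javascript
// Hypothetical helper: return the first element whose text equals the spoken text,
// ignoring case and surrounding whitespace, or null when nothing matches.
function findElementByText(elements, spokenText) {
  const wanted = spokenText.trim().toLowerCase();
  const match = Array.from(elements).find(
    (element) => element.textContent.trim().toLowerCase() === wanted
  );
  return match || null;
}
```

On the page from the question, this could be called as findElementByText(document.querySelectorAll('.radio-container button'), speechText), clicking the result when it is not null.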

SPS