I'm following this Apple sample code and was wondering how to clear the input buffer (i.e., result in this case) so that dictation restarts once a word has been said.

For example:

When a user says words, they are appended to result.bestTranscription.formattedString. So if I said "White", "Purple", "Yellow", result.bestTranscription.formattedString would be "White Purple Yellow", and words would keep being appended until the buffer stops (apparently after about 1 minute). I want an action to happen when a particular word is spoken: if you say "blue", for instance, I check whether the buffer contains "blue" (or "Blue"), and if it does, I move on to the next activity and reset the buffer.
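
To make the check above concrete, here is a minimal sketch that works on plain strings, so it can be tried outside the recognition callback. The names words(in:) and KeywordTracker are hypothetical and not part of the Speech framework. Instead of clearing the buffer, it remembers how many words have already been handled, on the assumption that the transcription mostly grows by appending (the recognizer may still revise earlier words in partial results).

```swift
import Foundation

/// Hypothetical helper: splits a transcript into lowercase words,
/// ignoring punctuation and extra whitespace.
func words(in transcript: String) -> [String] {
    return transcript
        .lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
}

/// Hypothetical tracker: "resets" logically by remembering how many
/// words were already handled, rather than restarting recognition.
struct KeywordTracker {
    let keyword: String
    private(set) var consumedWordCount = 0

    init(keyword: String) {
        self.keyword = keyword.lowercased()
    }

    /// Returns true if the keyword appears among the words that have not
    /// been consumed yet, and marks everything up to that word as consumed.
    mutating func detect(in transcript: String) -> Bool {
        let all = words(in: transcript)
        // Assumption: earlier words can be revised, so clamp the start
        // index to keep the slice valid if the transcript got shorter.
        let start = min(consumedWordCount, all.count)
        guard let index = all[start...].firstIndex(of: keyword) else {
            return false
        }
        consumedWordCount = index + 1
        return true
    }
}
```

In the callback, this would be fed result.bestTranscription.formattedString on every partial result. Matching whole words also means that "bluebird" would not count as "blue".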

When I reset the buffer by restarting recognition, however, I get this error:

2020-09-09 18:25:44.247575-0400 Test App[28460:1337362] [Utility] +[AFAggregator logDictationFailedWithError:] Error Domain=kAFAssistantErrorDomain Code=209 "(null)"

It works fine to stop the audio detection when it hears "blue", but as soon as I try to re-initialize the speech recognition code, it chokes. Below is my recognitionTask:

// Create a recognition task for the speech recognition session.
// Keep a reference to the task so that it can be canceled.
recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
    var isFinal = false
    
    if let result = result {
        // Update the text view with the results.
        self.speechInput = result.bestTranscription.formattedString
        isFinal = result.isFinal
        print("Text \(result.bestTranscription.formattedString)")
    }
    
    if (result?.bestTranscription.formattedString.range(of:"blue") != nil) || (result?.bestTranscription.formattedString.range(of:"Blue") != nil) {
        self.ColorView.backgroundColor = .random()
        isFinal = true
    }
    
    if error != nil || isFinal {
        // Stop recognizing speech if there is a problem.
        self.audioEngine.stop()
        self.recognitionRequest?.endAudio() // Necessary
        inputNode.removeTap(onBus: 0)

        self.recognitionRequest = nil
        self.recognitionTask = nil
    }
    
}
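
With shouldReportPartialResults set to true, a handler like the one above is called repeatedly for a single utterance, and typically once more with a final result or an error after endAudio(). Here is a minimal sketch of a guard that lets the teardown run only once per task. OneShot is a hypothetical name, and whether repeated teardown is related to the Code=209 error is an assumption, not something confirmed here.

```swift
/// Hypothetical one-shot latch: `fire` runs its action only the first
/// time it is called. Not thread-safe; this assumes the recognition
/// handler is always invoked on the same queue.
final class OneShot {
    private var hasFired = false

    /// Runs `action` once. Later calls are ignored and return false.
    @discardableResult
    func fire(_ action: () -> Void) -> Bool {
        guard !hasFired else { return false }
        hasFired = true
        action()
        return true
    }
}
```

One way to use it would be to create a new OneShot each time a recognition task is created, capture it in the handler, and wrap the stop/endAudio/removeTap block in fire { ... }, so that late callbacks from the old task do nothing.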

Here's the full code:

import UIKit
import Speech

private let audioEngine = AVAudioEngine()


extension CGFloat {
    static func random() -> CGFloat {
        
        return CGFloat(arc4random()) / CGFloat(UInt32.max)
    }
}

extension UIColor {
    static func random() -> UIColor {
        return UIColor(
           red:   .random(),
           green: .random(),
           blue:  .random(),
           alpha: 1.0
        )
    }
}

class ViewController: UIViewController, SFSpeechRecognizerDelegate  {
    @IBOutlet var ColorView: UIView!
    @IBOutlet weak var StartButton: UIButton!
    
    private let speechRecognizer = SFSpeechRecognizer()!
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?
    private let audioEngine = AVAudioEngine()
    
    private var speechInput: String = ""
    
    override func viewDidLoad() {
        super.viewDidLoad()
        
        // Create and configure the speech recognition request.
        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a SFSpeechAudioBufferRecognitionRequest object") }
        recognitionRequest.shouldReportPartialResults = true
        
        // Do any additional setup after loading the view.
        
    }
    
    func start() {
        // Cancel the previous task if it's running.
        recognitionTask?.cancel()
        self.recognitionTask = nil
        
        // Configure the audio session for the app.
        let audioSession = AVAudioSession.sharedInstance()
        do {
            try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
        } catch {}
        do {
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
            
        } catch {}
        let inputNode = audioEngine.inputNode

        // Create and configure the speech recognition request.
        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a SFSpeechAudioBufferRecognitionRequest object") }
        recognitionRequest.shouldReportPartialResults = true
        
        // Keep speech recognition data on device
        if #available(iOS 13, *) {
            recognitionRequest.requiresOnDeviceRecognition = false
        }
        
        // Create a recognition task for the speech recognition session.
        // Keep a reference to the task so that it can be canceled.
        recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
            var isFinal = false
            
            if let result = result {
                // Update the text view with the results.
                self.speechInput = result.bestTranscription.formattedString
                isFinal = result.isFinal
                print("Text \(result.bestTranscription.formattedString)")
            }
            
            if (result?.bestTranscription.formattedString.range(of:"blue") != nil) || (result?.bestTranscription.formattedString.range(of:"Blue") != nil) {
                self.ColorView.backgroundColor = .random()
                isFinal = true
            }
            
            if error != nil || isFinal {
                // Stop recognizing speech if there is a problem.
                self.audioEngine.stop()
                self.recognitionRequest?.endAudio() // Necessary
                inputNode.removeTap(onBus: 0)

                self.recognitionRequest = nil
                self.recognitionTask = nil
                
                self.start() //This chokes it
            }
            
        }

        // Configure the microphone input.
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self.recognitionRequest?.append(buffer)
        }
        
        audioEngine.prepare()
        do {
            try audioEngine.start()
        } catch {}
        
        // Let the user know to start talking.
        //textView.text = "(Go ahead, I'm listening)"
    }
    
    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        // Configure the SFSpeechRecognizer object already
        // stored in a local member variable.
        speechRecognizer.delegate = self
        
        // Asynchronously make the authorization request.
        SFSpeechRecognizer.requestAuthorization { authStatus in

            // Divert to the app's main thread so that the UI
            // can be updated.
            OperationQueue.main.addOperation {
                switch authStatus {
                case .authorized:
                    self.ColorView.backgroundColor = .green
                    
                case .denied:
                    self.ColorView.backgroundColor = .red
                    
                case .restricted:
                    self.ColorView.backgroundColor = .orange
                    
                case .notDetermined:
                    self.ColorView.backgroundColor = .gray
                    
                default:
                    self.ColorView.backgroundColor = .random()
                }
            }
        }
    }
    
    @IBAction func start(_ sender: Any) {
        start()
    }
    
    public func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
        if available {
            //recordButton.isEnabled = true
            //recordButton.setTitle("Start Recording", for: [])
        } else {
            //recordButton.isEnabled = false
            //recordButton.setTitle("Recognition Not Available", for: .disabled)
        }
    }
}

I'm sure there's something simple I'm missing. Any advice?

WiseOlMan
  • "but as soon as I try to re-initialize the speech recognition code, it chokes" Where is that part of the code, please? – matt Sep 09 '20 at 22:38
  • following the Apple Sample Code linked above. If you download the code, you will find that the code is all contained within a function, within if error != nil || isFinal, I tried calling that function again (self.startRecording()), after self.recognitionTask = nil – WiseOlMan Sep 10 '20 at 00:14
  • Let me put it another way. I don't want to follow a link, I don't want to guess what you're doing. I ask you to _show your actual code_. Not part of your code. The code that shows _how to reproduce the problem_. – matt Sep 10 '20 at 01:38
  • @matt, sorry for the delay, attached is the code as requested. – WiseOlMan Sep 13 '20 at 13:38

0 Answers