am trying to use SFSpeechRecognizer for speech to text, after speaking a welcome message to the user via AVSpeechUtterance. But randomly, the speech recognition does not start(after speaking the welcome message) and it throws the error message below.
[avas] ERROR: AVAudioSession.mm:1049: -[AVAudioSession setActive:withOptions:error:]: Deactivating an audio session that has running I/O. All I/O should be stopped or paused prior to deactivating the audio session.
It works few times. Am not clear on why is it not working consistently.
I tried the solutions mentioned in other SO posts, where it mentions to check if there are audio players running. I added that check in the speech to text part of the code. It returns false (i.e. no other audio player is running) But still the speech to text does not start listening for the user speech. Can you pls guide me on what is going wrong.
Am testing on iPhone 6 running iOS 10.3
Below are code snippets used:
TextToSpeech:
- (void) speak:(NSString *) textToSpeak {
[[AVAudioSession sharedInstance] setActive:NO withOptions:0 error:nil];
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback
withOptions:AVAudioSessionCategoryOptionDuckOthers error:nil];
[synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
AVSpeechUtterance* utterance = [[AVSpeechUtterance new] initWithString:textToSpeak];
utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:locale];
utterance.rate = (AVSpeechUtteranceMinimumSpeechRate * 1.5 + AVSpeechUtteranceDefaultSpeechRate) / 2.5 * rate * rate;
utterance.pitchMultiplier = 1.2;
[synthesizer speakUtterance:utterance];
}
- (void)speechSynthesizer:(AVSpeechSynthesizer*)synthesizer didFinishSpeechUtterance:(AVSpeechUtterance*)utterance {
//Return success message back to caller
[[AVAudioSession sharedInstance] setActive:NO withOptions:0 error:nil];
[[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryAmbient
withOptions: 0 error: nil];
[[AVAudioSession sharedInstance] setActive:YES withOptions: 0 error:nil];
}
Speech To Text:
- (void) recordUserSpeech:(NSString *) lang {
NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:lang];
self.sfSpeechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];
[self.sfSpeechRecognizer setDelegate:self];
NSLog(@"Step1: ");
// Cancel the previous task if it's running.
if ( self.recognitionTask ) {
NSLog(@"Step2: ");
[self.recognitionTask cancel];
self.recognitionTask = nil;
}
NSLog(@"Step3: ");
[self initAudioSession];
self.recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
NSLog(@"Step4: ");
if (!self.audioEngine.inputNode) {
NSLog(@"Audio engine has no input node");
}
if (!self.recognitionRequest) {
NSLog(@"Unable to created a SFSpeechAudioBufferRecognitionRequest object");
}
self.recognitionTask = [self.sfSpeechRecognizer recognitionTaskWithRequest:self.recognitionRequest resultHandler:^(SFSpeechRecognitionResult *result, NSError *error) {
bool isFinal= false;
if (error) {
[self stopAndRelease];
NSLog(@"In recognitionTaskWithRequest.. Error code ::: %ld, %@", (long)error.code, error.description);
[self sendErrorWithMessage:error.localizedFailureReason andCode:error.code];
}
if (result) {
[self sendResults:result.bestTranscription.formattedString];
isFinal = result.isFinal;
}
if (isFinal) {
NSLog(@"result.isFinal: ");
[self stopAndRelease];
//return control to caller
}
}];
NSLog(@"Step5: ");
AVAudioFormat *recordingFormat = [self.audioEngine.inputNode outputFormatForBus:0];
[self.audioEngine.inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
//NSLog(@"Installing Audio engine: ");
[self.recognitionRequest appendAudioPCMBuffer:buffer];
}];
NSLog(@"Step6: ");
[self.audioEngine prepare];
NSLog(@"Step7: ");
NSError *err;
[self.audioEngine startAndReturnError:&err];
}
- (void) initAudioSession
{
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
[audioSession setCategory:AVAudioSessionCategoryRecord error:nil];
[audioSession setMode:AVAudioSessionModeMeasurement error:nil];
[audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];
}
-(void) stopAndRelease
{
NSLog(@"Invoking SFSpeechRecognizer stopAndRelease: ");
[self.audioEngine stop];
[self.recognitionRequest endAudio];
[self.audioEngine.inputNode removeTapOnBus:0];
self.recognitionRequest = nil;
[self.recognitionTask cancel];
self.recognitionTask = nil;
}
Regarding the logs added, am able to see all logs till "Step7" printed.
When debugging the code in the device, it consistently triggers break at the below lines (I have exception breakpoints set) though, continue keeps on with the execution. It however happens same way during few successful executions as well.
AVAudioFormat *recordingFormat = [self.audioEngine.inputNode outputFormatForBus:0];
[self.audioEngine prepare];