I am trying to get a clumsy Objective-C proof-of-concept running that uses SFSpeechRecognizer on Catalina to transcribe a local audio file.

After some googling I managed to get the authorization to work by adding an Info.plist with NSSpeechRecognitionUsageDescription; I now get the authorization dialog and the correct SFSpeechRecognizerAuthorizationStatus (SFSpeechRecognizerAuthorizationStatusAuthorized).
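
For reference, the Info.plist entry looks like this (the description string here is just a placeholder):

<key>NSSpeechRecognitionUsageDescription</key>
<string>This app transcribes local audio files as a proof of concept.</string>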

However, my SFSpeechRecognizer instance is still unavailable. I suspect I must be making a stupid mistake due to my lack of basic Objective-C knowledge.

Any hints greatly appreciated.

Here's my code:

//
//  main.m
//  SpeechTestCatalina
//

#import <Foundation/Foundation.h>
#import <Speech/Speech.h>

void transcribeTestFile(){
    NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:@"en-US"];
    SFSpeechRecognizer *speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];

    NSLog(@"Locale %@, %@", speechRecognizer.locale.languageCode, speechRecognizer.locale.countryCode);
    NSLog(@"Available %hhd", speechRecognizer.available);
    NSLog(@"Auth status %ld", [SFSpeechRecognizer authorizationStatus]);
    NSLog(@"Supports on device %hhd", speechRecognizer.supportsOnDeviceRecognition);
    if(speechRecognizer.isAvailable && speechRecognizer.supportsOnDeviceRecognition){
        NSString *audioFilePath = @"/Users/doe/speech-detection/speech_sample.wav";
        NSURL *url = [[NSURL alloc] initFileURLWithPath:audioFilePath];
        NSLog(@"Analyzing %@ in language %@", url, locale.languageCode);
        SFSpeechURLRecognitionRequest *urlRequest = [[SFSpeechURLRecognitionRequest alloc] initWithURL:url];
        urlRequest.requiresOnDeviceRecognition = YES;
        urlRequest.shouldReportPartialResults = YES; // YES to animate the writing
        [speechRecognizer recognitionTaskWithRequest:urlRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error){
            NSString *transcriptText = result.bestTranscription.formattedString;
            if(!error){
                NSLog(@"Transcript: %@", transcriptText);
            } else {
                NSLog(@"Error: %@", error);
            }
        }];
    } else {
        NSLog(@"speechRecognizer is not available on this device");
    }
}


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus authStatus) {
            NSLog(@"Status: %ld", (long)authStatus);
            switch (authStatus) {
                case SFSpeechRecognizerAuthorizationStatusAuthorized:
                    //User gave access to speech recognition
                    NSLog(@"Authorized");

                    transcribeTestFile();

                    break;

                case SFSpeechRecognizerAuthorizationStatusDenied:
                    //User denied access to speech recognition
                    NSLog(@"SFSpeechRecognizerAuthorizationStatusDenied");
                    break;

                case SFSpeechRecognizerAuthorizationStatusRestricted:
                    //Speech recognition restricted on this device
                    NSLog(@"SFSpeechRecognizerAuthorizationStatusRestricted");
                    break;

                case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                    //Speech recognition not yet authorized

                    break;

                default:
                    NSLog(@"Default");
                    break;
            }
        }];

        NSLog(@"Sleeping");
        [NSThread sleepForTimeInterval:20.0f];

    }
    return 0;
}

The output when I run it is:

2020-01-26 17:48:39.454809+0100 SpeechTestCatalina[3623:82404] Sleeping
2020-01-26 17:48:41.182459+0100 SpeechTestCatalina[3623:82811] Status: 3
2020-01-26 17:48:41.182562+0100 SpeechTestCatalina[3623:82811] Authorized
2020-01-26 17:48:41.186933+0100 SpeechTestCatalina[3623:82811] Locale en, US
2020-01-26 17:48:41.190973+0100 SpeechTestCatalina[3623:82811] Available 0
2020-01-26 17:48:41.191269+0100 SpeechTestCatalina[3623:82811] Auth status 3
2020-01-26 17:48:41.197965+0100 SpeechTestCatalina[3623:82811] Supports on device 0
2020-01-26 17:48:41.198065+0100 SpeechTestCatalina[3623:82811] speechRecognizer is not available on this device
Program ended with exit code: 0
user1573546
  • Does this answer your question? [How to make SFSpeechRecognizer available on macOS?](https://stackoverflow.com/questions/59111644/how-to-make-sfspeechrecognizer-available-on-macos) – TheNextman Jan 27 '20 at 06:31
  • I don't think so. I had read this post before posting my question but the user reported a different problem, i.e. the authorization dialog not appearing, which works fine in my case. Also I do not understand what I could use as a delegate in my simple main program example and what mechanism would make it work then. Do you have an answer to the latter question? – user1573546 Jan 27 '20 at 09:50
  • I tried your code on my machine and it worked (speech recognition is available). I disabled "Enable Ask Siri" in System Preferences > Siri, and tried again, and now speech recognition is not available. Can you try toggling that setting and see if something changes? – TheNextman Jan 27 '20 at 20:43
  • You are right, that makes a difference. In addition I realized that I hadn't enabled code signing in the project; fixing that and enabling Siri got me one step further. However, now I get 0 for speechRecognizer.supportsOnDeviceRecognition, which I would not expect for locale "en-US", and if I remove the check for that and remove the line that sets requiresOnDeviceRecognition to YES, I now get these errors: ```AddInstanceForFactory: No factory registered for id F8BB1C28-BAE8-11D6-9C31-00039315CD46 HALC_ShellDriverPlugIn::Open: Can't get a pointer to the Open routine``` – user1573546 Jan 28 '20 at 08:39
  • There was a bug in iOS 13: `The supportsOnDeviceRecognition property always returns false the first time it’s accessed. After a few seconds, accessing it again returns the correct value.`. Maybe the same thing? – TheNextman Jan 28 '20 at 16:32
  • Does the recognition actually work in spite of the error on your console? – TheNextman Jan 28 '20 at 16:32
  • OK, this is funny, suddenly, without changing any code, the supportsOnDeviceRecognition property is 1. However, the result is still the same. Recognition only emits these same errors and the result callback passed to recognitionTaskWithRequest is never called. When I google these errors I get audio-related results. However the file plays back fine in Finder and Quick Look and I triple-checked the path. – user1573546 Jan 29 '20 at 11:20
  • I also tried different file formats (AAC in m4a, AIFF) and they all give the same results. Am I maybe lacking some sort of global audio initialization code? – user1573546 Jan 29 '20 at 13:13

2 Answers

You aren't getting the callback because your binary does not have a run loop. I'll borrow the response from a different question that has the same answer:

Callbacks in most Apple frameworks are delivered through your application's main run loop. If your command-line tool does not have a run loop, it cannot receive callbacks that are sent this way.

Without a runloop, the only way for the framework to invoke your callback would be to run it on another thread, which could lead to weird behaviour in an application that didn't expect that.

You can manually pump the run loop by inserting this code before the end of main:

NSRunLoop* runloop = [NSRunLoop currentRunLoop];
[runloop runUntilDate:[NSDate distantFuture]];

This will prevent your application from exiting; you'll need to update your logic to know when speech recognition is finished, restructuring it with a flag and a polling loop or similar. But I assume the logic inside your "real" application is different from this toy sample anyway.
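
For illustration, here's a minimal sketch of that restructuring (the finished flag and the isFinal check are my additions, not part of your code):

__block BOOL finished = NO;
[speechRecognizer recognitionTaskWithRequest:urlRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error){
    if (error) {
        NSLog(@"Error: %@", error);
        finished = YES; // stop pumping on failure
    } else {
        NSLog(@"Transcript: %@", result.bestTranscription.formattedString);
        if (result.isFinal) {
            finished = YES; // final transcription delivered
        }
    }
}];

// Pump the run loop in small increments so the callback can be delivered.
NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
while (!finished) {
    [runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
}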


The message:

AddInstanceForFactory: No factory registered for id F8BB1C28-BAE8-11D6-9C31-00039315CD46 HALC_ShellDriverPlugIn::Open: Can't get a pointer to the Open routine

that appears in your console is meaningless; it's a log statement leaking out of the system frameworks and you can disregard it.


Finally, for clarification on a couple other points:

  • "Enable Ask Siri" was required to be enabled in System Preferences > Siri for speech recognition to be available
  • There is a potential issue where the device may report that "on device recognition" is not available the first time you check, despite being supported for the chosen locale
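
A hedged sketch of a workaround for that quirk (assuming the iOS 13 behaviour described in the comments also applies on Catalina; the delay is arbitrary):

// Workaround sketch: the first read of supportsOnDeviceRecognition has been
// reported to return NO incorrectly, so re-check after a short delay.
if (!speechRecognizer.supportsOnDeviceRecognition) {
    [NSThread sleepForTimeInterval:2.0]; // arbitrary settling delay
    NSLog(@"Supports on device (second check): %hhd", speechRecognizer.supportsOnDeviceRecognition);
}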
TheNextman
  • Thank you so much! This was spot on. Recognition now works as expected after I added the run loop, despite the messages you told me to ignore, which are still there. The two points you added for clarification are also correct. For the latter point I am not 100% sure whether a reboot fixed it (I certainly did one before the first successful run) or it just started to work at some point. And yes, the real code will look much different now that I have a working proof of concept. – user1573546 Jan 30 '20 at 12:00
  • In my case it was 'Ask Siri' being disabled. – NSGodMode Sep 05 '21 at 15:46

I solved this issue by checking the "Audio Input" option under "Signing & Capabilities" in the target settings.

On both macOS and iOS you have to ask the user for Speech Recognition permission and for access to the microphone as well, but on macOS this is done differently than on iOS.

[Screenshot: Xcode target settings, "Signing & Capabilities" tab]
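
To request microphone access explicitly, a minimal sketch (assuming macOS 10.14 or later, where AVCaptureDevice gates microphone access, plus an NSMicrophoneUsageDescription entry in Info.plist):

#import <AVFoundation/AVFoundation.h>

// Sketch: prompt the user for microphone access; the handler runs asynchronously.
[AVCaptureDevice requestAccessForMediaType:AVMediaTypeAudio completionHandler:^(BOOL granted) {
    NSLog(@"Microphone access %@", granted ? @"granted" : @"denied");
}];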

veladan