AVSpeechSynthesizer has a fairly simple API, with no built-in support for saving the spoken output to an audio file.

I'm wondering if there's a way around this - perhaps recording the output as it's played silently, for playback later? Or something more efficient.

Andrew

3 Answers


This is finally possible: as of iOS 13, AVSpeechSynthesizer has write(_:toBufferCallback:):

import AVFoundation

let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "test 123")
utterance.voice = AVSpeechSynthesisVoice(language: "en")
var output: AVAudioFile?

synthesizer.write(utterance) { (buffer: AVAudioBuffer) in
   guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
      fatalError("unknown buffer type: \(buffer)")
   }
   if pcmBuffer.frameLength == 0 {
     // done: an empty buffer signals the end of the utterance
   } else {
     // append buffer to file (the AVAudioFile init and write both throw)
     if output == nil {
       output = try! AVAudioFile(
         forWriting: URL(fileURLWithPath: "test.caf"),
         settings: pcmBuffer.format.settings,
         commonFormat: .pcmFormatInt16,
         interleaved: false)
     }
     try! output?.write(from: pcmBuffer)
   }
}
Jan Berkel
  • When executing under Mac OS 10.15 the callback is never being called. The synthesizer just pronounces the text instead. – vasily Mar 10 '20 at 09:54
  • I just bumped into that on 10.15 too, any change/workaround @vasily? Oh well, back to NSSpeechSynthesizer. – glotcha Apr 27 '20 at 04:46
  • @glotcha No resolution so far – vasily Apr 28 '20 at 05:02
  • This is still a bug in macOS. I've raised a bug report via Feedback Assistant. – Pete Sep 23 '20 at 18:17
  • I've raised this now as a DTS incident with Apple, they're investigating. – Pete Oct 15 '20 at 07:38
  • With Big Sur 11.2 Beta, the callbacks are now being received. However, there is only ever one callback; and the frameLength for that callback is never zero! I've fed this information back to Apple. – Pete Jan 20 '21 at 09:53
  • I've today received the following response from Apple: "On macOS, there is only one callback made with all the audio data. You can know when the speech is finished, because the didFinishSpeaking callback is called when processing is done." – Pete Jan 21 '21 at 21:32
  • With Big Sur 11.1.x, the int32 audio data is received, but would appear to be an approximate square wave; you can hear the audio, but it is very heavily distorted. Raised with DTS... – Pete Jan 22 '21 at 13:51
  • Any solution yet? I am using Xamarin.iOS and the same happens, the app crashes in "WriteUtterance" and the callback is never executed. 1.5 years have passed... Has at least someone found a workaround? – Jaime Santos Nov 22 '21 at 17:56
  • Can't play the output file using AVAudioPlayer. – Purkylin Jan 22 '22 at 00:39
  • For me on macOS 12.2 the callback is still never being called. :/ – Daniel Feb 09 '22 at 19:12
  • Note that the synthesizer instance should not be created locally in the func. It should either be a member of a class or global, otherwise it is released before having the chance to start synthesizing and the callback is never called. – Vladimir Grigorov Apr 21 '22 at 08:13
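Following up on that last comment: the synthesizer has to outlive the write call, and on macOS the completion signal is the didFinish delegate callback rather than a trailing empty buffer. A minimal sketch of one way to wire that up (the SpeechWriter class, file handling, and try? error handling are illustrative, not part of the original answer):

import AVFoundation

final class SpeechWriter: NSObject, AVSpeechSynthesizerDelegate {
   // keep a strong reference so the synthesizer isn't deallocated
   // before the buffer callback has a chance to fire
   private let synthesizer = AVSpeechSynthesizer()
   private var output: AVAudioFile?

   override init() {
      super.init()
      synthesizer.delegate = self
   }

   func write(_ text: String, to url: URL) {
      let utterance = AVSpeechUtterance(string: text)
      utterance.voice = AVSpeechSynthesisVoice(language: "en")
      synthesizer.write(utterance) { [weak self] buffer in
         guard let self = self,
               let pcmBuffer = buffer as? AVAudioPCMBuffer,
               pcmBuffer.frameLength > 0 else { return }
         if self.output == nil {
            self.output = try? AVAudioFile(
               forWriting: url,
               settings: pcmBuffer.format.settings)
         }
         try? self.output?.write(from: pcmBuffer)
      }
   }

   // on macOS there may be only a single callback and no empty
   // trailing buffer, so rely on the delegate to detect completion
   func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                          didFinish utterance: AVSpeechUtterance) {
      output = nil   // releasing the AVAudioFile closes the file
   }
}

// keep `writer` alive (e.g. as a property) for the duration of synthesis
let writer = SpeechWriter()
writer.write("test 123", to: URL(fileURLWithPath: "test.caf"))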

As of now, AVSpeechSynthesizer does not support this. There is no way to get the audio file using AVSpeechSynthesizer. I tried this a few weeks ago for one of my apps and found that it is not possible; nothing has changed for AVSpeechSynthesizer in iOS 8 either.

I too thought of recording the sound as it is being played, but there are many flaws with that approach: the user might be using headphones, the system volume might be low or muted, and it might pick up other external sounds, so it's not advisable to go that route.

Bhumit Mehta

You can use OS X to prepare AIFF files (or perhaps an OS X-based service) via the NSSpeechSynthesizer method startSpeakingString:toURL:.
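For example, a minimal sketch (the text and file path are just illustrative); startSpeakingString:toURL: is exposed to Swift as startSpeaking(_:to:), and the rendering runs asynchronously, so the synthesizer must stay alive until it finishes:

import AppKit

let synth = NSSpeechSynthesizer()
// renders the text to an AIFF file instead of speaking it aloud;
// returns false if synthesis could not be started
let started = synth.startSpeaking("test 123",
                                  to: URL(fileURLWithPath: "test.aiff"))
// synthesis is asynchronous: keep `synth` alive and use its delegate's
// didFinishSpeaking callback to know when the file is ready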

deksden