
Hi. I want to implement a real-time audio application with the new AVAudioEngine in Swift. Does anyone have experience with the new framework? How do real-time applications work with it?

My first idea was to store the (processed) input data in an AVAudioPCMBuffer object and then have an AVAudioPlayerNode play it, as you can see in my demo class:

import AVFoundation

class AudioIO {
    var audioEngine: AVAudioEngine
    var audioInputNode: AVAudioInputNode
    var audioPlayerNode: AVAudioPlayerNode
    var audioMixerNode: AVAudioMixerNode
    var audioBuffer: AVAudioPCMBuffer

    init() {
        audioEngine = AVAudioEngine()
        audioPlayerNode = AVAudioPlayerNode()
        audioMixerNode = audioEngine.mainMixerNode

        let frameLength = AVAudioFrameCount(256)
        audioBuffer = AVAudioPCMBuffer(pcmFormat: audioPlayerNode.outputFormat(forBus: 0),
                                       frameCapacity: frameLength)!
        audioBuffer.frameLength = frameLength

        audioInputNode = audioEngine.inputNode

        // Tap the input and copy the incoming samples into my own buffer.
        audioInputNode.installTap(onBus: 0, bufferSize: frameLength,
                                  format: audioInputNode.outputFormat(forBus: 0)) { buffer, _ in
            let channels = UnsafeBufferPointer(start: buffer.floatChannelData,
                                               count: Int(buffer.format.channelCount))
            let floats = UnsafeBufferPointer(start: channels[0],
                                             count: Int(buffer.frameLength))

            let step = Int(self.audioMixerNode.outputFormat(forBus: 0).channelCount)
            for i in stride(from: 0, to: Int(self.audioBuffer.frameLength), by: step) {
                // doing my real-time stuff
                self.audioBuffer.floatChannelData![0][i] = floats[i]
            }
        }

        // set up the audio engine
        audioEngine.attach(audioPlayerNode)
        audioEngine.connect(audioPlayerNode, to: audioMixerNode,
                            format: audioPlayerNode.outputFormat(forBus: 0))
        do {
            try audioEngine.start()
        } catch {
            print("Could not start engine: \(error)")
        }

        // schedule the looping buffer and start the player
        audioPlayerNode.play()
        audioPlayerNode.scheduleBuffer(audioBuffer, at: nil, options: .loops, completionHandler: nil)
    }
}

But this is far from real time and not very efficient. Any ideas or experiences? It does not matter whether you prefer Objective-C or Swift; I am grateful for any notes, remarks, comments, solutions, etc.

Michael Dorner
  • Objective-C is not recommended for real-time programming. I'm not aware of Apple taking an official position on real-time programming in Swift yet, but there was some discussion on http://prod.lists.apple.com/archives/coreaudio-api/2014/Jun/msg00002.html – sbooth Jun 24 '14 at 12:02
  • Thank you for the link, but the essence of that discussion so far is: nobody knows anything. ;-) The question is not so much about the new programming language or whether Objective-C can process audio in real time; it is rather about how I can use AVAudioEngine for real-time applications, which Apple advertises in its WWDC14 session no. 502. – Michael Dorner Jun 24 '14 at 14:39
  • Objective-C can be used for writing real-time audio apps, but there are restrictions on what can be done inside Core Audio's `IOProcs`: for example, no memory allocation, no locks, no Objective-C method calls, etc. See http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing. I imagine that internally `AVAudioEngine` uses only C inside the realtime methods, and I also bet that the taps have the same restrictions as `IOProcs`. – sbooth Jun 24 '14 at 22:19
  • Michael, for buffer taps I would suggest using simple and plain C. Swift and Objective-C both introduce unpredictable overhead because of ARC, internal locks, and memory allocations. C is best for processing buffers. When it comes to feeding data to the main thread for display, use lock-free circular buffers and Objective-C. But why are you copying the input buffer yourself? You can connect `AVAudioEngine.inputNode` directly to `AVAudioEngine.outputNode` (see the sketch after these comments). – bio Jun 18 '16 at 19:07
  • By "real-time" do you mean recording, and doing stuff like drawing am waveform of the microphone's signal, or feeding the captured audio to a speech recognizer on the fly? If so, let me know, and I will post my code as an answer. – Josh Feb 09 '17 at 09:27
  • The second one, signal processing in real-time. Thanks in advance. – Michael Dorner Feb 09 '17 at 09:41
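
As a quick illustration of bio's suggestion above, here is a minimal sketch (my own, not bio's code) that lets the engine pull the microphone input and route it straight to the output via the main mixer, with no manual buffer copying. Be careful with feedback if the input is a live microphone and the output a speaker.

import AVFoundation

let engine = AVAudioEngine()
let input = engine.inputNode

// Route the input straight through the main mixer to the output;
// the engine pulls the input in its own real-time context.
engine.connect(input, to: engine.mainMixerNode, format: input.outputFormat(forBus: 0))

do {
    try engine.start()
} catch {
    print("Could not start engine: \(error)")
}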

4 Answers


I've been experimenting with AVAudioEngine in both Objective-C and Swift. In the Objective-C version of my engine, all audio processing is done purely in C (by caching the raw C sample pointers available through AVAudioPCMBuffer, and operating on the data with only C code). The performance is impressive.

Out of curiosity, I ported this engine to Swift. With tasks like playing an audio file linearly, or generating tones via FM synthesis, the performance is quite good, but as soon as arrays are involved (e.g. with granular synthesis, where sections of audio are played back and manipulated in a non-linear fashion), there is a significant performance hit. Even with the best optimization, CPU usage is 30-40% greater than with the Objective-C/C version. I'm new to Swift, so perhaps there are other optimizations of which I am ignorant, but as far as I can tell, C/C++ are still the best choice for real-time audio.

Also look at The Amazing Audio Engine. I'm considering this, as well as direct use of the older C API.
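
For what it's worth, here is a rough Swift sketch (my own, not Jason's code) of the pointer-caching idea: grab the raw float pointers from `AVAudioPCMBuffer.floatChannelData` once per callback and keep the per-sample loop free of allocations and object calls. A trivial gain stands in for the real DSP; the answer's point still stands that this hot loop is better written in C/C++.

import AVFoundation

let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)

input.installTap(onBus: 0, bufferSize: 512, format: format) { buffer, _ in
    // Cache the raw C sample pointers once per callback.
    guard let channelData = buffer.floatChannelData else { return }
    let frameCount = Int(buffer.frameLength)
    let channelCount = Int(buffer.format.channelCount)

    // Per-sample work on the raw floats only (a simple gain as a placeholder).
    for channel in 0..<channelCount {
        let samples = channelData[channel]
        for frame in 0..<frameCount {
            samples[frame] *= 0.5
        }
    }
}

do {
    try engine.start()
} catch {
    print("Could not start engine: \(error)")
}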

If you need to process live audio, then AVAudioEngine may not be for you. See my answer to this question: I want to call 20 times per second the installTapOnBus:bufferSize:format:block:

Jason McClinsey
  • Actually, my question was about the AVAudioEngine framework itself, not about Objective-C vs. Swift. But of course, the two are closely connected. – Michael Dorner Oct 28 '14 at 20:03
  • Michael Dorner, since you said that I didn't answer your question, and it seems to me that I did, perhaps if you rephrased it, I could add some additional info that could be useful. I've been working through similar problems in my free time (experimenting with AVAudioEngine from higher-level languages), and am interested in sharing what I've learned. – Jason McClinsey Oct 28 '14 at 20:30
  • @JasonMcClinsey: Would you be willing to post some code that demonstrates how you are using C and AVAudioPCMBuffer to do FM synthesis? (There is a new class called AVAudioUnitGenerator, which sounds promising, but the documentation is thin and says it is an API 'in development'.) – RonH Apr 28 '15 at 09:42
  • @JasonMcClinsey: I should have been more specific: I'm looking for Swift code. – RonH Apr 28 '15 at 09:58
  • @JasonMcClinsey I'm very confused.. You had AVAudioEngine running just fine in Objective-C - you ported it to Swift for some reason. You then saw the performance go way down.. And you blamed.. AVAudioEngine? Sounds like the Swift port was 100% to blame. – Roy Lovejoy May 19 '20 at 02:14
  • @RoyLovejoy, please read my comment again, as I did not blame AVAudioEngine. – Jason McClinsey May 21 '20 at 16:38

I think this does what you want: https://github.com/arielelkin/SwiftyAudio

Comment out the distortions and you have a clean loop.

hungrxyz

Apple has taken an official position on real-time coding in Swift. In the 2017 WWDC session on Core Audio, an Apple engineer said not to use either Swift or Objective-C methods inside the real-time audio callback context (implying: use only C, or maybe C++ or assembly code).

This is because Swift and Objective-C methods can involve internal memory management operations that do not have bounded latency.

This implies the proper scheme might be to use Objective-C for your AVAudioEngine controller class, but only the C subset (of Objective-C) inside the `installTapOnBus` block.
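
To illustrate the layering (in Swift rather than the Objective-C this answer recommends, so treat it only as a sketch), the tap block can hand the raw sample pointer straight to a plain C function and do nothing else. The C function here, `process_samples`, is hypothetical and assumed to be exposed through a bridging header.

import AVFoundation

// Hypothetical C function, assumed declared in a bridging header as:
//   void process_samples(float *samples, uint32_t frameCount);
// All real-time DSP lives there, not in Swift or Objective-C.

final class TapController {
    let engine = AVAudioEngine()

    func start() throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)

        input.installTap(onBus: 0, bufferSize: 256, format: format) { buffer, _ in
            // Only hand the raw pointer across; avoid any further
            // Swift/Objective-C work inside this block.
            guard let channelData = buffer.floatChannelData else { return }
            process_samples(channelData[0], buffer.frameLength)
        }

        try engine.start()
    }
}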

hotpaw2
  • The WWDC video on Apple's current site is https://developer.apple.com/videos/play/wwdc2017/501/?time=1268 ("Now here comes the actual render logic. And note that this part of the code is written in C++, and that is because, as I mentioned, it's not safe to use Objective-C or Swift runtime from a real-time context.") [update: I see @LloydRochester quoted a similar section too.] – natevw Dec 06 '19 at 19:19

According to Apple at WWDC 2017, in the "What's New in Core Audio" video at approximately 19:20, the engineer says that using Swift from a real-time context is "not safe". Posted from the transcript:

Now, because the rendering of the engine happens from a real-time context, you will not be able to use the render offline Objective-C or Swift method that we saw in the demo.

And that is because it is not safe to use the Objective-C or Swift runtime from a real-time context. So, instead, the engine itself provides you a render block that you can fetch and cache, and then later use this render block to render the engine from the real-time context. The next thing to do is to set up your input node so that you can provide your input data to the engine. And here, you specify the format of the input that you will provide, and this can be a different format than the output. And you also provide a block which the engine will call whenever it needs the input data. And when this block gets called, the engine will let you know how many input frames it actually needs.
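
Here is a rough Swift sketch of the APIs the transcript is describing (my own reading, not Apple's sample code; requires iOS 11 / macOS 10.13): put the engine into real-time manual rendering mode, give the input node an input block, and cache the render block up front so that the actual real-time code (ideally C/C++) can call it later.

import AVFoundation

let engine = AVAudioEngine()
let format = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 2)!

do {
    // Put the engine into real-time manual rendering mode.
    try engine.enableManualRenderingMode(.realtime, format: format, maximumFrameCount: 4096)

    // The engine calls this block whenever it needs input and tells you how
    // many frames it wants; return an AudioBufferList with that data, or nil
    // if none is available (filling it is omitted in this sketch).
    let inputInstalled = engine.inputNode.setManualRenderingInputPCMFormat(format) { _ in
        return nil
    }
    print("Input block installed: \(inputInstalled)")

    try engine.start()

    // Fetch and cache the render block once; unlike Objective-C/Swift methods,
    // this block is meant to be called from the real-time context
    // (in practice from C/C++ render code), roughly:
    //   let status = renderBlock(frameCount, outputBufferListPointer, &errorStatus)
    let renderBlock = engine.manualRenderingBlock
    _ = renderBlock
} catch {
    print("Engine setup failed: \(error)")
}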

Lloyd Rochester