
This is my first attempt at using CoreAudio. My goal is to capture microphone data, resample it to a new sample rate, and then get at the raw 16-bit PCM data.

My strategy for this is to make an AUGraph with the microphone --> a sample rate converter, and a callback that gets data from the converter's output (which I'm hoping is the mic output at the new sample rate?).

Right now my callback just fires with a null AudioBufferList*, which obviously isn't correct. How should I set this up and what am I doing wrong?

Code follows:

  CheckError(NewAUGraph(&audioGraph), @"Creating graph");
  CheckError(AUGraphOpen(audioGraph), @"Opening graph");

  AUNode micNode, converterNode;
  AudioUnit micUnit, converterUnit;

  makeMic(&audioGraph, &micNode, &micUnit);

  // get the Input/inputBus's stream description
  UInt32 sizeASBD = sizeof(AudioStreamBasicDescription);
  AudioStreamBasicDescription hwASBDin;
  AudioUnitGetProperty(micUnit,
                       kAudioUnitProperty_StreamFormat,
                       kAudioUnitScope_Input,
                       kInputBus,
                       &hwASBDin,
                       &sizeASBD);
  makeConverter(&audioGraph, &converterNode, &converterUnit, hwASBDin);

  // connect mic output to converterNode
  CheckError(AUGraphConnectNodeInput(audioGraph, micNode, 1, converterNode, 0),
             @"Connecting mic to converter");

  // set callback on the output? maybe?
  AURenderCallbackStruct callbackStruct;
  callbackStruct.inputProc = audioCallback;
  callbackStruct.inputProcRefCon = (__bridge void*)self;
  CheckError(AudioUnitSetProperty(micUnit,
                                kAudioOutputUnitProperty_SetInputCallback,
                                kAudioUnitScope_Global,
                                kInputBus,
                                &callbackStruct,
                                sizeof(callbackStruct)),
             @"Setting callback");

  CheckError(AUGraphInitialize(audioGraph), @"AUGraphInitialize");

  // activate audio session
  NSError *err = nil;
  AVAudioSession *audioSession = [AVAudioSession sharedInstance];
  if (![audioSession setActive:YES error:&err]){
    [self error:[NSString stringWithFormat:@"Couldn't activate audio session: %@", err]];
  }
  CheckError(AUGraphStart(audioGraph), @"AUGraphStart");

and:

void makeMic(AUGraph *graph, AUNode *micNode, AudioUnit *micUnit) {
  AudioComponentDescription inputDesc;
  inputDesc.componentType = kAudioUnitType_Output;
  inputDesc.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
  inputDesc.componentFlags = 0;
  inputDesc.componentFlagsMask = 0;
  inputDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

  CheckError(AUGraphAddNode(*graph, &inputDesc, micNode),
             @"Adding mic node");

  CheckError(AUGraphNodeInfo(*graph, *micNode, 0, micUnit),
             @"Getting mic unit");

  // enable microphone for recording
  UInt32 flagOn = 1; // enable value
  CheckError(AudioUnitSetProperty(*micUnit,
                                  kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input,
                                  kInputBus,
                                  &flagOn,
                                  sizeof(flagOn)),
             @"Enabling microphone");
}

and:

void makeConverter(AUGraph *graph, AUNode *converterNode, AudioUnit *converterUnit, AudioStreamBasicDescription inFormat) {
  AudioComponentDescription sampleConverterDesc;
  sampleConverterDesc.componentType = kAudioUnitType_FormatConverter;
  sampleConverterDesc.componentSubType = kAudioUnitSubType_AUConverter;
  sampleConverterDesc.componentFlags = 0;
  sampleConverterDesc.componentFlagsMask = 0;
  sampleConverterDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

  CheckError(AUGraphAddNode(*graph, &sampleConverterDesc, converterNode),
             @"Adding converter node");
  CheckError(AUGraphNodeInfo(*graph, *converterNode, 0, converterUnit),
             @"Getting converter unit");

  // describe desired output format
  AudioStreamBasicDescription convertedFormat;
  convertedFormat.mSampleRate       = 16000.0;
  convertedFormat.mFormatID         = kAudioFormatLinearPCM;
  convertedFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
  convertedFormat.mFramesPerPacket  = 1;
  convertedFormat.mChannelsPerFrame = 1;
  convertedFormat.mBitsPerChannel   = 16;
  convertedFormat.mBytesPerPacket   = 2;
  convertedFormat.mBytesPerFrame    = 2;

  // set format descriptions
  CheckError(AudioUnitSetProperty(*converterUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Input,
                                  0, // should be the only bus #
                                  &inFormat,
                                  sizeof(inFormat)),
             @"Setting format of converter input");
  CheckError(AudioUnitSetProperty(*converterUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output,
                                  0, // should be the only bus #
                                  &convertedFormat,
                                  sizeof(convertedFormat)),
             @"Setting format of converter output");
}
user358829
    Does this need to happen live? It's a lot easier to capture the audio to a file. Are you set on using the C APIs? AVAudioEngine can do quite a bit otherwise. – dave234 Jan 20 '17 at 07:10
  • it does need to happen live, but no preference on which set of APIs I use – user358829 Jan 20 '17 at 19:18
  • I checked AVAudioEngine; it looks like sample rate conversions are limited to certain sample rates for some reason. I guess the C APIs are necessary for odd sample rates. – dave234 Jan 20 '17 at 21:36

1 Answer


A render callback is used as a source of samples for an audio unit. If you set the kAudioOutputUnitProperty_SetInputCallback property on the remoteIO unit, you must call AudioUnitRender from within the callback you provide, and then you would have to do the sample rate conversion manually, which is ugly.
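
For illustration, a minimal sketch of that pattern (not from the original answer; gMicUnit is a hypothetical global holding the remoteIO unit, and this assumes a 16-bit mono format has already been set on the output scope of the input bus, with render sizes of at most 4096 frames):

static AudioUnit gMicUnit; //hypothetical global, set to the remoteIO unit at setup time

static OSStatus micInputCallback(void                       *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp       *inTimeStamp,
                                 UInt32                     inBusNumber,
                                 UInt32                     inNumberFrames,
                                 AudioBufferList            *ioData) {
    //ioData is NULL for an input callback; you must render into your own buffer
    static SInt16 sampleBuffer[4096];
    AudioBufferList bufferList;
    bufferList.mNumberBuffers              = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize   = inNumberFrames * sizeof(SInt16);
    bufferList.mBuffers[0].mData           = sampleBuffer;

    OSStatus status = AudioUnitRender(gMicUnit, ioActionFlags, inTimeStamp,
                                      inBusNumber, inNumberFrames, &bufferList);
    //...and then do the sample rate conversion on bufferList yourself...
    return status;
}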

There is an "easier" way. The remoteIO acts as two units, the input (mic) and the output (speaker). Create a graph with a remoteIO, then connect the mic to the speaker, using the desired format. Then you can get the data using a renderNotify callback, which acts as a "tap".

I created a ViewController class to demonstrate:

#import "ViewController.h"
#import <AudioToolbox/AudioToolbox.h>
#import <AVFoundation/AVFoundation.h>

@implementation ViewController

//Forward declaration so viewDidLoad can reference the callback defined below
OSStatus renderNotify(void                          *inRefCon,
                      AudioUnitRenderActionFlags    *ioActionFlags,
                      const AudioTimeStamp          *inTimeStamp,
                      UInt32                        inBusNumber,
                      UInt32                        inNumberFrames,
                      AudioBufferList               *ioData);

- (void)viewDidLoad {
    [super viewDidLoad];

    //Set your audio session to allow recording
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryPlayAndRecord error:NULL];
    [audioSession setActive:YES error:NULL];

    //Create graph and units
    AUGraph graph = NULL;
    NewAUGraph(&graph);

    AUNode ioNode;
    AudioUnit ioUnit = NULL;
    AudioComponentDescription ioDescription = {0};
    ioDescription.componentManufacturer = kAudioUnitManufacturer_Apple;
    ioDescription.componentType         = kAudioUnitType_Output;
    ioDescription.componentSubType      = kAudioUnitSubType_VoiceProcessingIO;

    AUGraphAddNode(graph, &ioDescription, &ioNode);
    AUGraphOpen(graph);
    AUGraphNodeInfo(graph, ioNode, NULL, &ioUnit);

    UInt32 enable = 1;
    AudioUnitSetProperty(ioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input, 1, &enable, sizeof(enable));

    //Set the output of the ioUnit's input bus, and the input of its output bus, to the desired format.
    //Core Audio basically has implicit converters that we're taking advantage of.
    AudioStreamBasicDescription asbd = {0};
    asbd.mSampleRate        = 16000.0;
    asbd.mFormatID          = kAudioFormatLinearPCM;
    asbd.mFormatFlags       = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    asbd.mFramesPerPacket   = 1;
    asbd.mChannelsPerFrame  = 1;
    asbd.mBitsPerChannel    = 16;
    asbd.mBytesPerPacket    = 2;
    asbd.mBytesPerFrame     = 2;

    AudioUnitSetProperty(ioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &asbd, sizeof(asbd));
    AudioUnitSetProperty(ioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &asbd, sizeof(asbd));

    //Connect the output of the remoteIO's input bus to the input of its output bus
    AUGraphConnectNodeInput(graph, ioNode, 1, ioNode, 0);

    //Add a render notify with a bridged reference to self (If using ARC)
    AudioUnitAddRenderNotify(ioUnit, renderNotify, (__bridge void *)self);

    //Start graph
    AUGraphInitialize(graph);
    AUGraphStart(graph);
    CAShow(graph);
}

OSStatus renderNotify(void                          *inRefCon,
                      AudioUnitRenderActionFlags    *ioActionFlags,
                      const AudioTimeStamp          *inTimeStamp,
                      UInt32                        inBusNumber,
                      UInt32                        inNumberFrames,
                      AudioBufferList               *ioData){

    //Filter out anything that isn't a post-render call on the input bus (the flags are a bitmask)
    if (!(*ioActionFlags & kAudioUnitRenderAction_PostRender) || inBusNumber != 1) {
        return noErr;
    }
    //Get a reference to self
    ViewController *self = (__bridge ViewController *)inRefCon;

    //Do stuff with audio
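    //For example (an illustrative sketch, not part of the original answer):
    //with the 16-bit mono format set above, each buffer holds packed SInt16 samples.
    for (int i = 0; i < ioData->mNumberBuffers; i++) {
        SInt16 *samples = (SInt16 *)ioData->mBuffers[i].mData;
        UInt32 sampleCount = ioData->mBuffers[i].mDataByteSize / sizeof(SInt16);
        for (UInt32 n = 0; n < sampleCount; n++) {
            SInt16 sample = samples[n]; //hand this off the render thread, e.g. to a ring buffer
            (void)sample;               //placeholder so the sketch compiles without warnings
        }
    }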

    //Optionally mute the playback by setting the buffers to zero
    for (int i = 0; i < ioData->mNumberBuffers; i++) {
        memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    }
    return noErr;
}


@end
dave234
  • ok great, thank you so much! one follow-up question: how do I get the 16-bit samples back from the AudioBufferList? right now I'm doing the following: http://pastebin.com/SM11ykf4 I basically just need to shove this data into an NSMutableArray of NSNumbers, but I don't seem to be getting good data out right now. – user358829 Jan 23 '17 at 21:05
  • I think you're going to have to learn some C. It's worth your time if you're dealing with audio. Creating objects for each sample on the render thread will probably not hold up. But to answer the question, you cast the data in ioData->mBuffers[].mData to the desired format. – dave234 Jan 23 '17 at 21:12
  • my C is actually pretty strong, I just need that data to go into an NSMutableArray of NSNumber so I can pass it to React Native. agreed that it's not efficient as-is, but I want to make sure I'm getting good data out before I optimize. casting mData to an SInt16* and walking through it gives me bad data -- in particular, here's a sample: 15275,0,15112,0,-17608,0,-17491,0,-17460,0,-17507,0,-17768,0,15076,0,15178 the alternating 0's make me feel like I'm doing something wrong here... I can't find good documentation on kAudioFormatFlagIsPacked, is that to blame for this? – user358829 Jan 23 '17 at 22:04
  • You're right! The render notify should have been filtering out the output bus; as written it was getting floats. I'll edit the answer. – dave234 Jan 23 '17 at 22:20
  • You should be using a circular buffer to get the samples off the render thread before allocating memory. Allocations (and Obj-C messages) can take locks, which will make the audio glitch. (A minimal sketch follows below.) – dave234 Jan 23 '17 at 22:23
  • ahh perfect! it works now. thank you so so so much for your help, there is no chance I would have figured this out by myself. – user358829 Jan 23 '17 at 22:33
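
Following the circular-buffer advice in the comments, a minimal single-producer/single-consumer ring buffer might look like this (an illustrative sketch, not part of the original answer; production code would want real atomic operations or an existing lock-free library such as TPCircularBuffer):

#import <Foundation/Foundation.h>

#define kRingCapacity 16384 //must comfortably exceed one render cycle's worth of samples

typedef struct {
    SInt16          data[kRingCapacity];
    volatile UInt32 head; //written only by the render thread
    volatile UInt32 tail; //written only by the consumer thread
} SampleRing;

//Render thread: copy samples in without taking locks, dropping them if the ring is full.
static void RingWrite(SampleRing *ring, const SInt16 *samples, UInt32 count) {
    for (UInt32 i = 0; i < count; i++) {
        UInt32 next = (ring->head + 1) % kRingCapacity;
        if (next == ring->tail) return; //full; drop the remainder
        ring->data[ring->head] = samples[i];
        ring->head = next;
    }
}

//Consumer thread: drain into an NSMutableArray of NSNumber (allocation is safe here).
static void RingDrain(SampleRing *ring, NSMutableArray<NSNumber *> *out) {
    while (ring->tail != ring->head) {
        [out addObject:@(ring->data[ring->tail])];
        ring->tail = (ring->tail + 1) % kRingCapacity;
    }
}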