
The goal here is to create an mp4 file with video captured through AVCaptureVideoDataOutput and audio recorded through Core Audio, then send the CMSampleBuffers of both to an AVAssetWriter that has an accompanying AVAssetWriterInput(AVMediaTypeVideo) and AVAssetWriterInput(AVMediaTypeAudio).
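For context, the buffers reach the writer roughly like this (a simplified sketch, not the exact code; `audioInput`/`videoInput` stand in for the two AVAssetWriterInputs, and the video callback appends to `videoInput` the same way):

    - (void)didRenderAudioSampleBuffer:(CMSampleBufferRef)sampleBuffer {
        // Start the writer and its session once, using the first buffer's timestamp
        if (self.mainAssetWriter.status == AVAssetWriterStatusUnknown) {
            [self.mainAssetWriter startWriting];
            [self.mainAssetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
        }
        // Then append buffers as they arrive
        if (self.audioInput.isReadyForMoreMediaData) {
            [self.audioInput appendSampleBuffer:sampleBuffer];
        }
    }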

My audio encoder copies the AudioBuffer into a new CMSampleBuffer and passes it to the AVAssetWriterInput(AVMediaTypeAudio). This example shows how the conversion from AudioBuffer to CMSampleBuffer is done: Conversion to CMSampleBuffer

Long story short, it does not work. The video shows up but no audio.

BUT, if I comment out the video encoding, then the audio is written to the file and audible.

That tells me, from experience, that it is a timing issue. The Conversion to CMSampleBuffer does show

   CMSampleTimingInfo timing = { CMTimeMake(1, 44100.0), kCMTimeZero, kCMTimeInvalid };

It produces a time whose CMTimeCopyDescription is {0/1 = 0.000}, which seems completely wrong to me. I tried keeping track of the frames rendered and passing the frame count as the time value and the sample rate as the timescale, like this

   CMSampleTimingInfo timing = { CMTimeMake(1, 44100.0), CMTimeMake(self.frameCount, 44100.0), kCMTimeInvalid };

But no dice. A nicer-looking CMSampleTimingInfo, {107520/44100 = 2.438}, but still no audio in the file.
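For reference, my understanding of the three CMSampleTimingInfo fields, which is what I was going for with the frame count:

    CMSampleTimingInfo timing = {
        .duration              = CMTimeMake(1, 44100),               // duration of ONE sample
        .presentationTimeStamp = CMTimeMake(self.frameCount, 44100), // when the first sample in this buffer plays
        .decodeTimeStamp       = kCMTimeInvalid                      // fine for audio, no frame reordering
    };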

The video CMSampleBuffer produces something like {65792640630624/1000000000 = 65792.641, rounded}. That tells me the AVCaptureVideoDataOutput has a timescale of 1 billion, likely nanoseconds, and I guess the time value is something like the device's absolute time. I can't find any info about what AVCaptureVideoDataOutput actually uses.
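(That value is the buffer's presentation timestamp, which can be inspected in the capture delegate roughly like this; the standard AVCaptureVideoDataOutputSampleBufferDelegate callback is assumed, and `videoInput` is illustrative:)

    - (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection {
        CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
        // Logs something like {65792640630624/1000000000 = 65792.641}
        NSLog(@"video pts: %@", (__bridge_transfer NSString *)CMTimeCopyDescription(kCFAllocatorDefault, pts));
        [self.videoInput appendSampleBuffer:sampleBuffer];
    }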

Anyone have any helpful guidance? Am I even on the right track?

Here's the conversion

    CMSampleBufferRef buff = NULL; /* CMSampleBufferCreate below allocates this; no malloc needed */
    CMFormatDescriptionRef format = NULL;

    self.frameCount += inNumberFrames;

    CMTime presentationTime = CMTimeMake(self.frameCount, self.pcmASBD.mSampleRate);

    AudioStreamBasicDescription audioFormat = self.pcmASBD;
    CheckError(CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                              &audioFormat,
                                              0,
                                              NULL,
                                              0,
                                              NULL,
                                              NULL,
                                              &format),
               "Could not create format from AudioStreamBasicDescription");

    CMSampleTimingInfo timing = { CMTimeMake(1, self.pcmASBD.mSampleRate), presentationTime, kCMTimeInvalid };

    CheckError(CMSampleBufferCreate(kCFAllocatorDefault,
                                    NULL,
                                    false,
                                    NULL,
                                    NULL,
                                    format,
                                    (CMItemCount)inNumberFrames,
                                    1,
                                    &timing,
                                    0,
                                    NULL,
                                    &buff),
               "Could not create CMSampleBufferRef");

    CheckError(CMSampleBufferSetDataBufferFromAudioBufferList(buff,
                                                              kCFAllocatorDefault,
                                                              kCFAllocatorDefault,
                                                              0,
                                                              audioBufferList),
               "Could not set data in CMSampleBufferRef");

    [self.delegate didRenderAudioSampleBuffer:buff];

    CFRelease(buff);

And the asset writer inputs I create

    func createVideoInputWriter()->AVAssetWriterInput? {
        let numPixels                               = Int(self.size.width * self.size.height)
        let bitsPerPixel:Int                        = 11
        let bitRate                                 = Int64(numPixels * bitsPerPixel)
        let fps:Int                                 = 30
        let settings:[NSObject : AnyObject]         = [
            AVVideoCodecKey                         : AVVideoCodecH264,
            AVVideoWidthKey                         : self.size.width,
            AVVideoHeightKey                        : self.size.height,
            AVVideoCompressionPropertiesKey         : [
                AVVideoAverageBitRateKey            : NSNumber(longLong: bitRate),
                AVVideoMaxKeyFrameIntervalKey       : NSNumber(integer: fps)
            ]
        ]

        var assetWriter:AVAssetWriterInput!
        if self.mainAssetWriter.canApplyOutputSettings(settings, forMediaType:AVMediaTypeVideo) {
            assetWriter                             = AVAssetWriterInput(mediaType:AVMediaTypeVideo, outputSettings:settings)
            assetWriter.expectsMediaDataInRealTime  = true
            if self.mainAssetWriter.canAddInput(assetWriter) {
                self.mainAssetWriter.addInput(assetWriter)
            }
        }
        return assetWriter;
    }

    func createAudioInputWriter()->AVAssetWriterInput? {
        let settings:[NSObject : AnyObject]         = [
            AVFormatIDKey                           : kAudioFormatMPEG4AAC,
            AVNumberOfChannelsKey                   : 2,
            AVSampleRateKey                         : 44100,
            AVEncoderBitRateKey                     : 64000
        ]

        var assetWriter:AVAssetWriterInput!
        if self.mainAssetWriter.canApplyOutputSettings(settings, forMediaType:AVMediaTypeAudio) {
            assetWriter                             = AVAssetWriterInput(mediaType:AVMediaTypeAudio, outputSettings:settings)
            assetWriter.expectsMediaDataInRealTime  = true
            if self.mainAssetWriter.canAddInput(assetWriter) {
                self.mainAssetWriter.addInput(assetWriter)
            } else {
                let error = NSError(domain:CMHDFileEncoder.Domain, code:CMHDFileEncoderErrorCode.CantAddInput.rawValue, userInfo:nil)
                self.errorDelegate.hdFileEncoderError(error)
            }
        } else {
            let error = NSError(domain:CMHDFileEncoder.Domain, code:CMHDFileEncoderErrorCode.CantApplyOutputSettings.rawValue, userInfo:nil)
            self.errorDelegate.hdFileEncoderError(error)
        }
        return assetWriter
    }

1 Answer


Of course: I had the problem for two weeks, posted the question on a Friday night, and found the solution Monday morning.

The research I came across put me on the right track...

The 1000000000 timescale is for nanoseconds. But the time value has to be the device's absolute (host) time, converted to nanoseconds.

This post explains it better than I can: mach time
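In other words, as best I understand it, AVCaptureSession stamps its buffers with the host time clock, so the audio timestamps need to land on that same clock. A quick sanity check, assuming nothing else about the setup:

    #include <mach/mach_time.h>

    // "Now" on the host time clock, as a CMTime - the timescale is 1000000000 (nanoseconds)
    CMTime hostNow = CMClockGetTime(CMClockGetHostTimeClock());

    // mach_absolute_time() converted to nanoseconds should land on the same timeline
    mach_timebase_info_data_t info;
    mach_timebase_info(&info);
    uint64_t nanos = mach_absolute_time() * info.numer / info.denom;

    NSLog(@"host clock: %lld/%d   mach: %llu", hostNow.value, hostNow.timescale, nanos);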

I ended up using this code to fix it:

    CMSampleBufferRef buff = NULL; /* CMSampleBufferCreate below allocates this; no malloc needed */
    CMFormatDescriptionRef format = NULL;

    AudioStreamBasicDescription audioFormat = self.pcmASBD;
    CheckError(CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                              &audioFormat,
                                              0,
                                              NULL,
                                              0,
                                              NULL,
                                              NULL,
                                              &format),
               "Could not create format from AudioStreamBasicDescription");

    /* Convert the host time from the render callback's AudioTimeStamp to
       nanoseconds using the mach timebase (needs <mach/mach_time.h>) */
    mach_timebase_info_data_t info;
    mach_timebase_info(&info);

    uint64_t time = inTimeStamp->mHostTime;
    time *= info.numer;
    time /= info.denom;

    /* kDeviceTimeScale is 1000000000 (nanoseconds), the same timescale the video buffers use */
    CMTime presentationTime                 = CMTimeMake(time, kDeviceTimeScale);
    CMSampleTimingInfo timing               = { CMTimeMake(1, self.pcmASBD.mSampleRate), presentationTime, kCMTimeInvalid };

    CheckError(CMSampleBufferCreate(kCFAllocatorDefault,
                                    NULL,
                                    false,
                                    NULL,
                                    NULL,
                                    format,
                                    (CMItemCount)inNumberFrames,
                                    1,
                                    &timing,
                                    0,
                                    NULL,
                                    &buff),
               "Could not create CMSampleBufferRef");

    CheckError(CMSampleBufferSetDataBufferFromAudioBufferList(buff,
                                                              kCFAllocatorDefault,
                                                              kCFAllocatorDefault,
                                                              0,
                                                              audioBufferList),
               "Could not set data in CMSampleBufferRef");
  • Hello, thanks for this solution. Can you explain what inTimeStamp and info are? – Pablo Martinez Jan 28 '16 at 11:32
  • Hey Pablo, the `inTimeStamp` is the timestamp associated with the sample buffer from the callback function set on the audio unit. It gets assigned via an `AURenderCallbackStruct` (there's a sketch of that wiring below). I highly recommend [Learning Core Audio](http://www.amazon.com/Learning-Core-Audio-Hands-On-Programming/dp/0321636848) if you want to learn more. – mylegfeelsfunny Jan 28 '16 at 16:04
  • Thanks! The problem is, that I get the AudioBufferList from a streaming service. Do you know how can I do this? – Pablo Martinez Jan 28 '16 at 16:07
  • Offhand I am not sure, the [AURenderCallback](https://developer.apple.com/library/mac/documentation/AudioUnit/Reference/AUComponentServicesReference/#//apple_ref/c/tdef/AURenderCallback) sends me the `AudioBufferList` and the `AudioTimeStamp`. But I would look into populating that timestamp yourself, then passing it forward if you can. That would change the time but if the change is always consistent it could work. Bear in mind I am speculating at this point. – mylegfeelsfunny Jan 28 '16 at 16:17
  • Hi, I know this is an old thread, but I'm doing the same thing as you do, and the presentation timestamp of the `CMSampleBuffer` created from the audio and the timestamp of the `CMSampleBuffer` coming in from the video in `AVCaptureSession` are really different. Did you also have this issue? – YYfim Jun 24 '21 at 06:57
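
To make the wiring in those comments concrete: `inTimeStamp` is handed to the render callback registered on the audio unit. A rough sketch (the callback name and `audioUnit` variable are illustrative; the exact bus and scope depend on the unit being used):

    static OSStatus MyRenderCallback(void                       *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp       *inTimeStamp,
                                     UInt32                      inBusNumber,
                                     UInt32                      inNumberFrames,
                                     AudioBufferList            *ioData) {
        // inTimeStamp->mHostTime is the mach host time used in the answer above
        // ... render/copy the audio, then build the CMSampleBuffer ...
        return noErr;
    }

    // Registering the callback on the audio unit
    AURenderCallbackStruct callback = { .inputProc = MyRenderCallback, .inputProcRefCon = NULL };
    AudioUnitSetProperty(audioUnit,
                         kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Input,
                         0,
                         &callback,
                         sizeof(callback));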