8

How can I extract audio from a video file without using FFmpeg?

I want to use AVMutableComposition and AVURLAsset to solve it, e.g. converting a .mov file to an .m4a file.

Tripti Kumar
  • As far as I know, AVFoundation on iOS knows *nothing* about how to decode or open "`.flv`" files so this question is a non-starter to begin with. That's why you need to use some third party library such as ffmpeg or something else, in order to open a .flv file and convert it to something you can make proper use of. – Michael Dautermann Jul 12 '12 at 12:22
  • I'll edit the example, as I did not think about this point... but my requirement is still the same. :( – Tripti Kumar Jul 12 '12 at 12:25

2 Answers

14

The following Swift 5 / iOS 12.3 code shows how to extract audio from a movie file (.mov) and convert it to an audio file (.m4a) by using AVURLAsset, AVMutableComposition and AVAssetExportSession:

import UIKit
import AVFoundation

class ViewController: UIViewController {

    @IBAction func extractAudioAndExport(_ sender: UIButton) {
        // Create a composition
        let composition = AVMutableComposition()
        do {
            let sourceUrl = Bundle.main.url(forResource: "Movie", withExtension: "mov")!
            let asset = AVURLAsset(url: sourceUrl)
            guard let audioAssetTrack = asset.tracks(withMediaType: AVMediaType.audio).first else { return }
            guard let audioCompositionTrack = composition.addMutableTrack(withMediaType: AVMediaType.audio, preferredTrackID: kCMPersistentTrackID_Invalid) else { return }
            try audioCompositionTrack.insertTimeRange(audioAssetTrack.timeRange, of: audioAssetTrack, at: CMTime.zero)
        } catch {
            // Stop here: exporting a composition without an audio track would fail anyway
            print(error)
            return
        }

        // Get url for output
        let outputUrl = URL(fileURLWithPath: NSTemporaryDirectory() + "out.m4a")
        if FileManager.default.fileExists(atPath: outputUrl.path) {
            try? FileManager.default.removeItem(atPath: outputUrl.path)
        }

        // Create an export session
        let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetPassthrough)!
        exportSession.outputFileType = AVFileType.m4a
        exportSession.outputURL = outputUrl

        // Export file
        exportSession.exportAsynchronously {
            guard exportSession.status == .completed else { return }

            DispatchQueue.main.async {
                // Present a UIActivityViewController to share audio file
                guard let outputURL = exportSession.outputURL else { return }
                let activityViewController = UIActivityViewController(activityItems: [outputURL], applicationActivities: [])
                self.present(activityViewController, animated: true, completion: nil)
            }
        }
    }

}
Imanou Petit
4

In most multimedia container formats, the audio and video streams are encoded separately and their frames are interleaved in the file. Removing the video from such a file therefore does not require any encoding or decoding: you can write a file format parser that drops the video track, without using the multimedia APIs on the phone.

To do this without a third-party library, you need to write the parser from scratch, which can be simple or difficult depending on the file format. For example, FLV is very simple, so stripping a track out of it is easy: walk the stream, detect where each frame (tag) begins and drop the ones of type 0x09 (video). MP4 is a bit more complex: its header (MOOV) has a hierarchical structure with a header for each track (TRAK atoms). You need to drop the video TRAK, then copy the interleaved bitstream atom (MDAT), skipping all the video data clusters as you copy, and update the chunk offsets stored for the remaining track so that they point at the moved data.
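The FLV case described above can be sketched in a few lines of Swift using only Foundation. This is a minimal illustration, not a production parser: it assumes the whole stream fits in memory and follows the usual FLV layout (a header, a 4-byte PreviousTagSize field, then a sequence of tags each followed by its own PreviousTagSize field). The names `readUInt` and `stripVideoTags(fromFLV:)` are made up for this example.

```swift
import Foundation

/// Reads a big-endian unsigned integer of `count` bytes starting at `offset`.
func readUInt(_ bytes: [UInt8], at offset: Int, count: Int) -> Int {
    var value = 0
    for i in 0..<count {
        value = (value << 8) | Int(bytes[offset + i])
    }
    return value
}

/// Returns a copy of an FLV byte stream with all video tags (type 0x09) removed.
/// Returns nil if the input does not look like a well-formed FLV stream.
/// (Hypothetical helper for illustration; assumes the standard FLV layout.)
func stripVideoTags(fromFLV input: [UInt8]) -> [UInt8]? {
    // FLV header: "FLV", version, flags, 4-byte header size (normally 9).
    guard input.count >= 13,
          input[0] == 0x46, input[1] == 0x4C, input[2] == 0x56 else { return nil }
    let headerSize = readUInt(input, at: 5, count: 4)
    guard headerSize >= 9, headerSize + 4 <= input.count else { return nil }

    // Copy the header plus the 4-byte PreviousTagSize0 field that follows it.
    var output = Array(input[0..<(headerSize + 4)])
    output[4] &= ~UInt8(0x01)   // clear the "has video" bit in the flags byte

    var offset = headerSize + 4
    while offset + 11 <= input.count {
        let tagType = input[offset] & 0x1F   // 8 = audio, 9 = video, 18 = script data
        let dataSize = readUInt(input, at: offset + 1, count: 3)
        // 11-byte tag header + payload + trailing 4-byte PreviousTagSize
        let tagEnd = offset + 11 + dataSize + 4
        guard tagEnd <= input.count else { return nil }   // truncated tag
        if tagType != 0x09 {
            // Keep the tag together with its own PreviousTagSize field,
            // so the size fields in the output stay consistent.
            output.append(contentsOf: input[offset..<tagEnd])
        }
        offset = tagEnd
    }
    return output
}
```

To try it on a real file, the bytes could be loaded with `[UInt8](try Data(contentsOf: url))`, where `url` is whatever file URL you have. Note that the result is still an FLV file containing only audio and script-data tags, not an .m4a file.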

There are third-party libraries you can use, aside from ffmpeg. One that comes to mind is GPAC's MP4Box (LGPL license). If the LGPL is a problem, there are plenty of commercial SDKs that you can use.

onon15
  • Thanks for your answer.. +1 for it.. but if you could help me with the coding part..that would be a great help :) – Tripti Kumar Jul 24 '12 at 08:54
  • Sorry... MOV (similar to MP4) is a complicated file format, and writing such a parser takes at least a day or two of coding, so I can't help you with it. I guess your original idea of using AVMutableComposition is a better way to go (it should do just the same). Basically, an M4A file is almost the same as a MOV without the video track, so opening the MOV as an AVMutableComposition and calling removeTrack might do the trick... – onon15 Jul 24 '12 at 14:06
  • @onon15 - (+1) I have a file with 1 audio TRAK and 1 video TRAK (aac, h264). How can I distinguish between the samples in the 'mdat' atom? Thanks! – Avishay Cohen Aug 08 '12 at 14:51
  • It's not as easy as you'd expect, but not hard once you get the hang of it. **You can't get it from the MDAT itself**. The offsets of the data chunks belonging to each track are stored in the `STCO` table (or `CO64`) inside `TRAK>MDIA>MINF>STBL`. The length of each chunk is another calculation you need to do on info in the `STBL`. See [this reference](http://wiki.multimedia.cx/index.php?title=QuickTime_container#stco) – onon15 Aug 09 '12 at 11:03
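As a rough illustration of the last comment, the following Swift sketch (Foundation only) walks the atom hierarchy `moov > trak > mdia > minf > stbl` and reads the chunk offsets from `stco` or `co64` for each track, together with the handler type from `hdlr` ("soun" for audio, "vide" for video). It assumes the file is fully loaded in memory and is well formed. The names `Box`, `boxes(in:range:)` and `chunkOffsetsByTrack(in:)` are invented for this example. Chunk lengths are not computed here; they would have to be derived from the other tables in `stbl` (`stsc` and `stsz`).

```swift
import Foundation

/// One atom: its four-character type and the byte range of its body.
struct Box {
    let type: String
    let payload: Range<Int>
}

/// Reads a big-endian integer of `count` bytes starting at `offset`.
func readBE(_ bytes: [UInt8], at offset: Int, count: Int) -> Int {
    var v = 0
    for i in 0..<count { v = (v << 8) | Int(bytes[offset + i]) }
    return v
}

/// Lists the atoms laid out back to back inside `range`.
func boxes(in bytes: [UInt8], range: Range<Int>) -> [Box] {
    var result: [Box] = []
    var offset = range.lowerBound
    while offset + 8 <= range.upperBound {
        var size = readBE(bytes, at: offset, count: 4)
        let type = String(decoding: bytes[(offset + 4)..<(offset + 8)], as: UTF8.self)
        var headerSize = 8
        if size == 1 {            // 64-bit size stored after the type field
            guard offset + 16 <= range.upperBound else { break }
            size = readBE(bytes, at: offset + 8, count: 8)
            headerSize = 16
        } else if size == 0 {     // atom extends to the end of the enclosing range
            size = range.upperBound - offset
        }
        guard size >= headerSize, size <= range.upperBound - offset else { break }
        result.append(Box(type: type, payload: (offset + headerSize)..<(offset + size)))
        offset += size
    }
    return result
}

/// For each TRAK, returns its handler type and the chunk offsets from STCO/CO64.
func chunkOffsetsByTrack(in bytes: [UInt8]) -> [(handler: String, offsets: [Int])] {
    func child(_ type: String, of parent: Range<Int>) -> Box? {
        return boxes(in: bytes, range: parent).first { $0.type == type }
    }
    var tracks: [(handler: String, offsets: [Int])] = []
    guard let moov = child("moov", of: 0..<bytes.count) else { return tracks }
    for trak in boxes(in: bytes, range: moov.payload) where trak.type == "trak" {
        guard let mdia = child("mdia", of: trak.payload),
              let hdlr = child("hdlr", of: mdia.payload), hdlr.payload.count >= 12,
              let minf = child("minf", of: mdia.payload),
              let stbl = child("stbl", of: minf.payload) else { continue }
        // hdlr body: version/flags (4 bytes), pre_defined (4 bytes), handler type (4 bytes)
        let h = hdlr.payload.lowerBound + 8
        let handler = String(decoding: bytes[h..<(h + 4)], as: UTF8.self)
        var offsets: [Int] = []
        // stco stores 32-bit offsets, co64 stores 64-bit ones
        for (name, width) in [("stco", 4), ("co64", 8)] {
            guard let table = child(name, of: stbl.payload),
                  table.payload.count >= 8 else { continue }
            let start = table.payload.lowerBound
            let count = readBE(bytes, at: start + 4, count: 4)
            guard 8 + count * width <= table.payload.count else { continue }
            offsets = (0..<count).map { readBE(bytes, at: start + 8 + $0 * width, count: width) }
        }
        tracks.append((handler: handler, offsets: offsets))
    }
    return tracks
}
```

Each returned offset is an absolute position in the file, so the audio chunks are the byte runs that start at the offsets listed for the track whose handler is "soun".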