42

Joining multiple files using ffmpeg concat seems to result in a mismatch of the timestamps or offsets for the audio. I've tried with several videos and noticed the same problem for h.264 / MP4.

Using concat and encoding the video seems to work fine. The audio stays in sync as ffmpeg does the full conversion calculations and seems to get everything right.

However, simply concatenating the videos without any transformation or encoding results in a slowly increasing sync issue. Obviously, encoding the videos rather than simply joining them will result in a loss of information/quality so I would rather find a way around this problem.

I've tried several flags to sort out this problem that appears to be based on the timestamps. None of these seem to correct the problem though.

ffmpeg -f concat -fflags +genpts -async 1 -i segments.txt test.mov
ffmpeg -auto_convert 1 -f concat -fflags +genpts -async 1 -i segments.txt -c copy test2.mov
ffmpeg -f concat -i segments.txt -c copy -fflags +genpts test3.mp4
ffmpeg -f concat -fflags +genpts -async 1 -i segments.txt -copyts test4.mov
ffmpeg -f concat -i segments.txt -copyts test5.mov
ffmpeg -f concat -i segments.txt -copyts -c copy test6.mov
ffmpeg -f concat -fflags +genpts -i segments.txt -copyts -c copy test7.mov

Note: all other questions that I could find on SO seem to "fix" the problem by simply encoding the videos over again. Not a good solution.

Update

I realized the concat wasn't the problem. The original set of clips had mismatched timestamps. Somehow concat + encoding fixed the issue, but I don't want to re-encode the videos and lose quality each time.

ffmpeg -y -ss 00:00:02.750 -i input.MOV -c copy -t 00:00:05.880 output.MOV

Which resulted in the following data

ffprobe -v quiet -show_entries stream=start_time,duration output.MOV

start_time=-0.247500
duration=6.131125
start_time=-0.257333
duration=6.155333
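
A quick way to quantify the mismatch is to subtract the two `duration` values. This is a minimal sketch, assuming ffprobe's `key=value` lines arrive on standard input; `stream_gap` is a hypothetical helper name, not an ffmpeg tool:

```shell
# stream_gap: read ffprobe "key=value" lines on stdin and print the
# absolute difference between the first two duration= values, in seconds.
# Hypothetical helper for illustration only.
stream_gap() {
  LC_ALL=C awk -F= '
    $1 == "duration" { d[++n] = $2 }
    END {
      if (n < 2) exit 1          # need two streams to compare
      gap = d[2] - d[1]
      if (gap < 0) gap = -gap
      printf "%.6f\n", gap
    }'
}

# The values reported for output.MOV:
printf 'start_time=-0.247500\nduration=6.131125\nstart_time=-0.257333\nduration=6.155333\n' | stream_gap
```

For this segment the gap is about 0.024 s. A gap of that size in every segment would plausibly add up across many joins.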

Since then I've tried using -to and -t in different places, along with -af apad -c:v copy, and I've still failed to get the durations to match.

Here is the full ffprobe output

Here is the original (red) vs the segment (green)

Detailed Sample Files

I recorded a sample video, added the commands to chop it up, then concat it. http://davidpennington.me/share/audio_sync_test_video.zip

Xeoncross
  • 1
    Audio may have to be re-encoded but not [video](http://stackoverflow.com/questions/35397034/ffmpeg-concatenate-videos-with-audio-not-synched/35397602#35397602). You can use `-video_track_timescale` to change video timebase of MOV/MP4s without re-encoding. If you paste details of input files, that will be helpful. – Gyan Feb 15 '16 at 19:27
  • I think it might be related to this [ffmpeg ticket for mp4/aac](https://trac.ffmpeg.org/ticket/3859) – Xeoncross Feb 28 '16 at 21:51
  • [This comment about keyframes](http://stackoverflow.com/a/18449609/99923) might be part of the problem with the audio sync issues. – Xeoncross Feb 28 '16 at 23:05
  • sboisse suggests using [ffprobe to find the nearest keyframes, then clip at that point](http://stackoverflow.com/a/14013439/99923). – Xeoncross Feb 29 '16 at 00:10
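
The keyframe-snapping idea from the last comment could look something like the sketch below. It assumes a sorted list of keyframe timestamps, one per line, on standard input. `snap_to_keyframe` is a hypothetical helper, and the ffprobe invocation in the comment is only one possible way to produce such a list, not something taken from the linked answers.

```shell
# snap_to_keyframe TARGET: read ascending keyframe times (seconds, one per
# line) on stdin and print the last one that is <= TARGET.
# Hypothetical helper for illustration only.
snap_to_keyframe() {
  LC_ALL=C awk -v target="$1" '
    NF && $1 + 0 <= target + 0 { best = $1; found = 1 }
    END { if (found) print best; else exit 1 }'
}

# Assumed way to list keyframe times (requires ffprobe; not run here):
#   ffprobe -v error -select_streams v:0 -skip_frame nokey \
#           -show_entries frame=pts_time -of csv=p=0 input.MOV
# Example with made-up timestamps and the 2.750 s cut point from the question:
printf '0.000000\n2.002000\n4.004000\n' | snap_to_keyframe 2.750
```

The printed value would then be used as the `-ss` argument instead of the arbitrary cut point, so the copied segment starts on a frame that can be decoded on its own.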

5 Answers

29

This two step process should work

Step 1 Pad out the audio in each segment

ffmpeg -i segment1.mov -af apad -c:v copy <audio encoding params> -shortest -avoid_negative_ts make_zero -fflags +genpts padded1.mov

Or

Generate segments with synced streams

ffmpeg -y -ss 00:00:02.750 -i input.MOV -c copy -t 00:00:05.880 -avoid_negative_ts make_zero -fflags +genpts segment.MOV

Step 2 Concat

ffmpeg -f concat -i segments.txt -c copy test.mov

where segments.txt consists of the names of the padded files.
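
For reference, the concat demuxer expects one `file '...'` line per input in segments.txt. A minimal sketch for generating it, assuming the padded files are named padded1.mov, padded2.mov and so on (names chosen for illustration) and contain no single quotes; `make_concat_list` is a hypothetical helper:

```shell
# make_concat_list FILE...: print one concat-demuxer entry per argument.
# Assumes file names contain no single quotes. Hypothetical helper.
make_concat_list() {
  for f in "$@"; do
    printf "file '%s'\n" "$f"
  done
}

# Redirect the output to segments.txt when using it for real:
#   make_concat_list padded1.mov padded2.mov padded3.mov > segments.txt
make_concat_list padded1.mov padded2.mov padded3.mov
```

The order of the arguments is the order in which the segments are joined.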

Gyan
  • What does "padding the audio" do? How does it fix the timestamps? – Xeoncross Feb 20 '16 at 16:39
  • 1
    The audio in some or all of the segments is not equal to the video in length. So the audio joints aren't at the same time as the video joints, hence the async. The first step pads the audio, i.e. adds an indefinite period of silence at the end of each segment, but `-shortest` stops the operation when the video stream ends, thus making the audio and video the same length (as much as possible). – Gyan Feb 20 '16 at 16:50
  • I tried `-af apad -c:v copy – Xeoncross Feb 21 '16 at 15:52
  • Are these segments from the same file? – Gyan Feb 21 '16 at 16:23
  • Yes, the segments are from the same original source file. – Xeoncross Feb 23 '16 at 18:41
  • 1
    I don't know the codecs of the streams in the file but generally the durations will not match, since both streams are quantized i.e. for a 25 fps video, duration will be multiples of 0.04s, and for AAC audio @ 48 kHz, multiples of 0.0213s. I doubt that's the problem here. Post a ffprobe readout for the whole input and one of the segments you made (before my `apad` suggestion) – Gyan Feb 23 '16 at 19:02
  • Mulvya, I updated the post with the full ffprobe output. Each segment is the same exact codecs. – Xeoncross Feb 28 '16 at 21:09
  • 2
    Your segments have negative PTS because ffmpeg is cutting segments at the keyframe before your split point but is assigning PTS 0 to your split point so frames before have negative PTS. So my edited command remedies that. However, there's a hitch. The amount of audio before the split point is not equal to the video before, so there will still be some silence at the joints. sboisse's method may be the safest. – Gyan Feb 29 '16 at 05:32
  • Unfortunately the audio pad still doesn't result in a sync. I even parsed the ffprobe output for the closest keyframes to where I'm splitting the segments out and then added the pad above, and it still gets out of sync. I'm really wondering what else I can do. Just a couple of clips into the combined concat video, the disconnect between video and audio is already 1/6th of a second. – Xeoncross Mar 03 '16 at 22:05
  • Can you share enough files for me to reproduce the problem? – Gyan Mar 04 '16 at 05:28
  • I recorded a sample video, added the commands to chop it up, then concat it. http://davidpennington.me/share/audio_sync_test_video.zip I'll be happy to add another 200 point bounty if you can solve this. – Xeoncross Mar 05 '16 at 01:33
  • Will look into it. Only saw the above comment today. – Gyan Mar 13 '16 at 05:35
  • I just scanned the instructions.txt: the concat command is re-encoding the file. Was that meant? – Gyan Mar 13 '16 at 05:55
  • How much out of sync were you getting with your original method? – Gyan Mar 14 '16 at 08:26
  • I was trying both re-encoding and straight `-c copy` both causing sync issues. I'm not sure exactly how much out of sync the resulting files were after concat, but it's pretty noticeable just a couple seconds into the video. – Xeoncross Mar 14 '16 at 14:43
  • So, like a half a second out of sync (or more)? – Gyan Mar 14 '16 at 14:49
  • See my non-transcoded concat : http://www.datafilehost.com/d/2bcbf726. Looks in sync to me. Going by the text file, looks like you adapted my step 1 command to your segment generating command instead of feeding it the *generated* segment. See edited answer for a step 1 command to use as a segment-generating step. – Gyan Mar 14 '16 at 15:18
  • I had to make it a three-step process to sync the audio. First I cut the segment, then I padded it like you show, then I combined them all with concat and it seems to stay in sync. Thank you so much! – Xeoncross Mar 17 '16 at 23:01
  • Why? I was able to achieve an in-sync concat by using the new command I posted. Did you see my uploaded video? – Gyan Mar 18 '16 at 05:30
  • Yes, simply doing the slice + concat worked for that small video. I tried it for a longer video and noticed a very slight sync issue after about 1 minute of video. I checked with ffprobe and all I saw was a `.009` offset in the start time of the video stream. I padded the movie (which didn't seem to change much, if anything, judging by the waveform), but the video synced better even after 1 minute, and the padded video segments had a new offset of something like `0.042`. I think part of the problem is that the player, OS platform, and file types also result in slightly different outputs. – Xeoncross Mar 18 '16 at 14:08
  • Mulvya, regarding your comment above, can you explain how 48Khz = 0.0213s? Wouldn't 48khz = 1/48000 = 0.0000208333s, therefore being fine enough so that ffmpeg can (start and) end the audio stream almost exactly when the video stream (starts and) ends? – Mark Schneider Aug 29 '16 at 21:13
  • 1024 samples per AAC audio frame & 48000 Hz = 48000 / 1024 = 46.875 frames/sec. So 1 frame = 0.0213 s. – Gyan Aug 29 '16 at 21:25
  • @Mulvya, hi, sir, i have tried your solution but in my case it didn't work, can you please have a look at the below so question? please? http://stackoverflow.com/questions/40860443/ffmpeg-f-concat-video-audio-sycn-issue – Zakir_SZH Nov 29 '16 at 07:57
  • @Gyan I am having a similar issue with mp4 files. Unfortunately, when I try your step 1, `-avoid_negative_ts make_zero` gives me an incorrect codec parameters error. I've tried searching around to see what could be causing this but haven't had much luck. If I replace `make_zero` with `1`, it appears to actually go through, but I have no idea what this is doing. – Richard Thomas Aug 06 '19 at 00:15
  • @RichardThomas open a new Q showing your *full* command and log. – Gyan Aug 06 '19 at 04:34
  • None of the suggestions here worked for me, but then I found that the audio sampling rate was different between the files. Once I made them all the same, concat worked perfectly and I had no audio issues. – Grimeire Oct 04 '21 at 14:49
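
The frame-size arithmetic from the comments above can be checked directly. A small sketch, assuming 1024 samples per AAC frame at 48 kHz and 25 fps video as in Gyan's examples; `frame_seconds` is a hypothetical helper:

```shell
# frame_seconds UNITS RATE: duration in seconds of one frame that spans
# UNITS ticks at RATE ticks per second. Hypothetical helper.
frame_seconds() {
  LC_ALL=C awk -v n="$1" -v rate="$2" 'BEGIN { printf "%.4f\n", n / rate }'
}

frame_seconds 1024 48000   # one AAC frame at 48 kHz
frame_seconds 1 25         # one video frame at 25 fps
```

This gives 0.0213 s for the audio frame and 0.0400 s for the video frame. Since neither step divides evenly into the other, the two streams of a cut segment will rarely end at exactly the same instant.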
11

I encountered a similar problem and found a solution that worked, at least for me. In my case, I was also concatenating files, and found audio/video sync problems on iOS, but not on Windows (e.g., VLC media player showed no synchronization problems using the same mp4 file). The symptom on iOS when playing this concatenated mp4 was initially good synchronization followed by an increasing loss of synchronization as the movie played, with audio going faster than video. Interestingly, the sync could be restored temporarily by advancing the movie progress slider to any point in the movie, but then the sync would be lost again as the movie continued to play on iOS. By playing the same movie simultaneously on both iOS and Windows VLC, initially synchronized with each other as well as I could manage, and observing the evolution of the "echo" between them, I concluded that the iOS audio was going too fast (assuming the Windows player is correct).

For me, the solution was to add the audio filter option -af aresample=async=1000 to the ffmpeg command, which I found as an example in the ffmpeg online documentation and used verbatim. I don't know if this setting is optimal, but the result was an mp4 with audio and video remaining synchronized when played on both iOS and VLC. This ffmpeg option yielded proper iOS synchronization both during concatenation and afterwards when re-encoding the already concatenated file.
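
In command form, that option might be combined with the concat demuxer roughly as follows. This is a sketch rather than the exact command I ran: the file names are placeholders, and because audio filters cannot be used together with stream copy, it copies only the video and re-encodes the audio (AAC is an assumption here). The hypothetical helper `concat_resync_cmd` only prints the command so it can be inspected before running.

```shell
# concat_resync_cmd LIST OUTPUT: print (do not run) an ffmpeg command that
# copies the video stream and re-encodes the audio through aresample.
# LIST and OUTPUT are placeholder names. Hypothetical helper.
concat_resync_cmd() {
  printf 'ffmpeg -f concat -i %s -c:v copy -af aresample=async=1000 -c:a aac %s\n' "$1" "$2"
}

concat_resync_cmd segments.txt output.mp4
```

Copying the printed line into a terminal runs the actual concat. The video stays untouched, so only the audio goes through another encode.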

Paul
  • 1
    It's the only solution that worked for me. It requires re-encoding the audio, though (error: `Filtergraph 'aresample=async=1000' was defined for audio output stream 0:1 but codec copy was selected. Filtering and streamcopy cannot be used together`), so I had to change the flag `-c copy` to `-c:v copy`. – GG. Feb 02 '22 at 20:23
  • I use this argument with the concat filter and it works. And I was like _But why?_ – ipid Feb 22 '22 at 05:11
  • This does solve the async issue. Besides that, is there a way to sync the video to the audio instead? It may require re-encoding the video track, but that's okay with me. – Radical Edward Aug 16 '23 at 10:05
3

You can use filter_complex with the concat filter to join different inputs in one go:

ffmpeg -i input1.mp4 -i input2.webm \
-filter_complex "[0:v:0] [0:a:0] [1:v:0] [1:a:0] concat=n=2:v=1:a=1 [v] [a]" \
-map "[v]" -map "[a]" <encoding options> output.mkv
Wang Liang
flower
  • 3
    Your command uses a filter, and therefore will re-encode, but Xeoncross wants to avoid that. – llogan Feb 20 '16 at 01:00
3

I have been struggling with this one for quite some time as well, particularly when working with Panasonic AVCHD-generated MTS files. My current solution is to concatenate them at the OS level rather than with ffmpeg. I do this on Windows and it looks something like this:

COPY /b input_1.mts + input_2.mts + input_3.mts output.mts

On Linux it should be something like:

$ cat input_1.mts input_2.mts input_3.mts > output.mts

You can look up the documentation for binary concatenation on Windows and Linux.

This method of concatenation, as opposed to transcoding, is the way to go if the original format will work for you. It uses practically no CPU processing and preserves the original quality. A win-win when dealing with bulk media of high quality.

salmore
  • I would think this would fail with formats that have meta-data as the first X bytes of the file. Maybe not with all the safeguards built into media handling (like reading until the end of file regardless of what the stream data says?) – Xeoncross Feb 23 '16 at 18:43
  • This is a valid concern and should be taken into consideration when concatenating files on a binary level. – salmore Feb 23 '16 at 18:45
  • 2
    TS is a streaming container. This won't work with MP4, MOV, etc. – Gyan Feb 23 '16 at 19:03
1

If the input videos have the same video format, audio format, dimensions, etc., you can use mkvmerge from mkvtoolnix to concatenate the videos without re-encoding:

mkvmerge -o output.mkv file1.mkv + file2.mkv + file3.mkv

mkvmerge also accepts input files with an MP4 container, but the output file will have an MKV container even if you try to specify the filename extension of the output file as .mp4. You can change the container with ffmpeg:

mkvmerge -o output.mkv file1.mp4 + file2.mp4 + file3.mp4
ffmpeg -i output.mkv -c copy output.mp4

I needed to concatenate videos from different sources that had been encoded with different settings, so I first used a command like this to resize and re-encode the input videos:

for f in *.mp4;do w=1280;h=720;ffmpeg -i "$f" -filter:v "scale=iw*min($w/iw\,$h/ih):ih*min($w/iw\,$h/ih),pad=$w:$h:($w-iw*min($w/iw\,$h/ih))/2:($h-ih*min($w/iw\,$h/ih))/2" -c:v libx264 -crf 22 -preset slow -pix_fmt yuv420p -c:a aac -q:a 1 -ac 2 -ar 44100 "${f%mp4}mkv";done

Some of my input videos didn't have an audio channel, so I used a command like this to add a silent audio channel to the videos:

for f in *.mkv;do ffprobe "$f"|&grep -q 1:\ Audio||{ ffmpeg -i "$f" -f lavfi -i anullsrc -c:a aac -shortest -c:v copy "temp-$f";mv "temp-$f" "$f";};done

I then concatenated the videos using mkvmerge:

mkvmerge -o output.mkv `printf %s\\n *.mkv|sed '1!s/^/+ /'`
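
The backtick expression in that last command just builds mkvmerge's `file1 + file2 + ...` argument list from the glob. A small sketch of what it expands to, using made-up file names in place of `*.mkv`; `plus_list` is a hypothetical wrapper:

```shell
# plus_list FILE...: print the arguments one per line, prefixing every
# line except the first with "+ ". Hypothetical wrapper for illustration.
plus_list() {
  printf '%s\n' "$@" | sed '1!s/^/+ /'
}

plus_list a.mkv b.mkv c.mkv
```

Because the shell word-splits the result of the backticks, the one-liner only works for file names without spaces.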
nisetama
  • 1
    OP asks about ffmpeg. What's the point to offer other tool? Like you ask something about C# and someone answers about qBasic... – Alex Sham Dec 10 '20 at 19:32
  • 2
    After spending hours trying every FFmpeg suggestion/fix I could find, I gave up and tried this. It worked first time. Thanks a lot; I wish I had tried it first. – Grimeire Oct 04 '21 at 14:26