Merging multiple video files with ffmpeg and xfade filter

Question

I need to merge multiple video files (with included audio) into a single video. I noticed that the xfade filter was recently released and tried it, but I am running into an audio sync issue.

All videos have the same format, resolution, frame rate, bitrate, etc., for both video and audio.

Here is what I am using to merge 5 videos of various durations with 0.5-second crossfade transitions:

ffmpeg \
-i v0.mp4 \
-i v1.mp4 \
-i v2.mp4 \
-i v3.mp4 \
-i v4.mp4 \
-filter_complex \
"[0][1]xfade=transition=fade:duration=0.5:offset=3.5[V01]; \
 [V01][2]xfade=transition=fade:duration=0.5:offset=32.75[V02]; \
 [V02][3]xfade=transition=fade:duration=0.5:offset=67.75[V03]; \
 [V03][4]xfade=transition=fade:duration=0.5:offset=98.75[video]; \
 [0:a][1:a]acrossfade=d=0.5:c1=tri:c2=tri[A01]; \
 [A01][2:a]acrossfade=d=0.5:c1=tri:c2=tri[A02]; \
 [A02][3:a]acrossfade=d=0.5:c1=tri:c2=tri[A03]; \
 [A03][4:a]acrossfade=d=0.5:c1=tri:c2=tri[audio]" \
-vsync 0 -map "[video]" -map "[audio]" out.mp4

The code above generates a video with audio. The first and second segments are aligned with the audio, but starting with the second transition the sound is misaligned.

@llogan it's massive https://pastebin.com/SGnqB7Lt – tjk Aug 24 '20 at 19:56

llogan · Accepted Answer · 2021-08-24T22:38:43.917

25

Your offsets are incorrect. Try:

ffmpeg -i v0.mp4 -i v1.mp4 -i v2.mp4 -i v3.mp4 -i v4.mp4 -filter_complex \
"[0][1:v]xfade=transition=fade:duration=1:offset=3[vfade1]; \
 [vfade1][2:v]xfade=transition=fade:duration=1:offset=10[vfade2]; \
 [vfade2][3:v]xfade=transition=fade:duration=1:offset=21[vfade3]; \
 [vfade3][4:v]xfade=transition=fade:duration=1:offset=25,format=yuv420p; \
 [0:a][1:a]acrossfade=d=1[afade1]; \
 [afade1][2:a]acrossfade=d=1[afade2]; \
 [afade2][3:a]acrossfade=d=1[afade3]; \
 [afade3][4:a]acrossfade=d=1" \
-movflags +faststart out.mp4

How to get xfade offset values:

input       input duration   +   previous xfade `offset`   -   xfade `duration`   =   `offset`
`v0.mp4`    4                +   0                         -   1                  =   3
`v1.mp4`    8                +   3                         -   1                  =   10
`v2.mp4`    12               +   10                        -   1                  =   21
`v3.mp4`    5                +   21                        -   1                  =   25

(These are simplified example durations that differ from the durations shown in the original question.)
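
The rule in the table can be written as a short loop. This is a minimal Python sketch, not part of the original answer; the function name `xfade_offsets` is made up for illustration, and it assumes every transition uses the same duration.

```python
def xfade_offsets(durations, fade_duration):
    """Return one xfade offset per transition.

    `durations` holds the length in seconds of every input except the last,
    because the last input never starts a transition.
    """
    offsets = []
    previous_offset = 0.0
    for duration in durations:
        # offset = input duration + previous xfade offset - xfade duration
        previous_offset = duration + previous_offset - fade_duration
        offsets.append(previous_offset)
    return offsets


# Durations of v0.mp4 .. v3.mp4 from the table, with 1-second fades.
print(xfade_offsets([4, 8, 12, 5], 1))  # [3.0, 10.0, 21.0, 25.0]
```

The printed values match the last column of the table.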

See xfade and acrossfade filter documentation for more info.
See FFmpeg Wiki: xfade for a gallery of transition effects and more examples.
You can get input durations with ffprobe.
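
As a rough illustration of reading durations with ffprobe, here is a small Python sketch. The helper names `ffprobe_args` and `probe_duration` are hypothetical. It reads the container-level duration, which may differ slightly from the duration of the video stream itself.

```python
import subprocess


def ffprobe_args(path):
    """Build the ffprobe argument list that prints only the container duration."""
    return [
        "ffprobe",
        "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def probe_duration(path):
    """Run ffprobe on `path` and return its duration in seconds as a float."""
    result = subprocess.run(
        ffprobe_args(path), capture_output=True, text=True, check=True
    )
    return float(result.stdout.strip())
```

With ffprobe installed, `[probe_duration(f) for f in ["v0.mp4", "v1.mp4"]]` would give the list of durations needed for the offset calculation.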

edited Aug 24 '21 at 22:38

answered Aug 25 '20 at 00:24

llogan


You are right, figured it out today too with some tests. I also found a way to get a better sync by trimming/padding audio prior to these actions. This is an issue since most of the files' audio and video durations are not equal. So here is what I am doing now (that's another set of videos + your other suggestions): https://pastebin.com/QGcVZsk3 – tjk Aug 25 '20 at 00:58
Using this approach I got "More than 1000 frames duplicated", and in the middle of the output the video and audio are out of sync. Any ideas? – Richard Apr 29 '21 at 03:57
@Trương Quốc Khánh could you add xfade to ffmpegargs? – Spartan 117 Apr 05 '22 at 13:18
@Richard Did you ever solve the out-of-sync issue? I am facing the same issue right now. – user1734282 Dec 30 '22 at 18:15
@user1734282 can you try this? ffmpeg -i video0.mp4 -i video1.mp4 -i video2.mp4 -filter_complex "[0:v][1:v]xfade=transition=fade:duration=0.500:offset=41.567[v01]; [v01][2:v]xfade=transition=fade:duration=0.500:offset=55.534,format=yuv420p[video]; [0:a][1:a]acrossfade=d=0.500:c1=tri:c2=tri[a01]; [a01][2:a]acrossfade=d=0.500:c1=tri:c2=tri[audio]" -map [video] -map [audio] -movflags +faststart output.mp4 – Richard Jan 06 '23 at 03:25
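
tjk's comment above mentions trimming or padding each input's audio so that it matches the video before crossfading. The contents of the linked pastebin are not reproduced here, so the following Python sketch is only one possible way to do that and not necessarily what tjk used: it pads each audio stream with silence via `apad` and cuts it to the video length via `atrim`. The function name, the `[aN]` labels, and the assumption that the per-input video durations are already known are all illustrative.

```python
def audio_fit_filters(video_durations):
    """Build one 'pad then trim' audio chain per input.

    Each chain pads the audio with silence (apad) and then cuts it to the
    given video duration (atrim), so audio and video have equal length.
    """
    chains = []
    for index, duration in enumerate(video_durations):
        chains.append(
            "[%d:a]apad,atrim=duration=%.3f[a%d]" % (index, duration, index)
        )
    return ";".join(chains)


print(audio_fit_filters([4, 8]))
# [0:a]apad,atrim=duration=4.000[a0];[1:a]apad,atrim=duration=8.000[a1]
```

The resulting `[a0]`, `[a1]`, ... labels would then take the place of `[0:a]`, `[1:a]`, ... in the acrossfade chain.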

score 5 · Answer 2 · answered Dec 28 '20 at 20:21

Automating the process will help deal with errors in calculating the offsets. I created a Python script that makes the calculation and builds a graph for any size list of input videos:

https://gist.github.com/royshil/369e175960718b5a03e40f279b131788

It will check the lengths of the video files (with ffprobe) to figure out the right offsets.

The crux of the matter is building the filter graph and calculating the offsets:

# Prepare the filter graph
video_fades = ""
audio_fades = ""
last_fade_output = "0:v"
last_audio_output = "0:a"
video_length = 0
for i in range(len(segments) - 1):
    # Video graph: chain the xfade operator together
    video_length += file_lengths[i]
    next_fade_output = "v%d%d" % (i, i + 1)
    video_fades += "[%s][%d:v]xfade=duration=0.5:offset=%.3f[%s]; " % \
        (last_fade_output, i + 1, video_length - 1, next_fade_output)
    last_fade_output = next_fade_output

    # Audio graph:
    next_audio_output = "a%d%d" % (i, i + 1)
    audio_fades += "[%s][%d:a]acrossfade=d=1[%s]%s " % \
        (last_audio_output, i + 1, next_audio_output, ";" if (i+1) < len(segments)-1 else "")
    last_audio_output = next_audio_output

It may produce a filter graph such as

[0:v][1:v]xfade=duration=0.5:offset=42.511[v01]; 
[v01][2:v]xfade=duration=0.5:offset=908.517[v12]; 
[v12][3:v]xfade=duration=0.5:offset=1098.523[v23]; 
[v23][4:v]xfade=duration=0.5:offset=1234.523[v34]; 
[v34][5:v]xfade=duration=0.5:offset=2375.523[v45]; 
[v45][6:v]xfade=duration=0.5:offset=2472.526[v56]; 
[v56][7:v]xfade=duration=0.5:offset=2659.693[v67]; 
[0:a][1:a]acrossfade=d=1[a01]; 
[a01][2:a]acrossfade=d=1[a12]; 
[a12][3:a]acrossfade=d=1[a23]; 
[a23][4:a]acrossfade=d=1[a34]; 
[a34][5:a]acrossfade=d=1[a45]; 
[a45][6:a]acrossfade=d=1[a56]; 
[a56][7:a]acrossfade=d=1[a67]

I modified the code and hardcoded 3 file names in segments. It works very well for audio; however, the last video is not added. Instead, the last frame of the middle video is shown until the music of the last video has finished. — Mayank Kumar Chaudhari, Oct 03 '21 at 10:26

score 3 · Answer 3 · answered Mar 09 '21 at 12:14

The Python script above helped me a lot, but it has a mistake in the offset calculation. The offset for the video stream should be 'video_length - fade_duration*(i+1)'.

As in the code below:

import ffmpeg  # ffmpeg-python package, provides ffmpeg.probe

def gen_filter(segments):
    video_fades = ""
    audio_fades = ""
    settb = ""
    last_fade_output = "0:v"
    last_audio_output = "0:a"
    fade_duration = 0.3

    video_length = 0
    file_lengths = [0]*len(segments)
    
    for i in range(len(segments)):
        settb += "[%d]settb=AVTB[%d:v];" % (i,i)

    for i in range(len(segments)-1):

        file_lengths[i] = float(ffmpeg.probe(segments[i])['format']['duration'])

        video_length += file_lengths[i]
        next_fade_output = "v%d%d" % (i, i + 1)
        video_fades += "[%s][%d:v]xfade=transition=fade:duration=%f:offset=%f%s%s" % \
            (last_fade_output, i + 1, fade_duration, video_length - fade_duration*(i+1), '['+next_fade_output+'];' if (i) < len(segments)-2 else "","" if (i) < len(segments)-2 else ",format=yuv420p[video];")
        last_fade_output = next_fade_output

        next_audio_output = "a%d%d" % (i, i + 1)
        audio_fades += "[%s][%d:a]acrossfade=d=%f%s" % \
            (last_audio_output, i + 1, fade_duration*2, '['+next_audio_output+'];' if (i) < len(segments)-2 else "[audio]")
        last_audio_output = next_audio_output
        
    return settb + video_fades + audio_fades
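
To see why the correction matters, the two offset formulas can be compared side by side. This Python sketch is an illustration only; the function names are made up, and the durations are the simplified ones from the accepted answer (every input except the last, with 1-second fades).

```python
from itertools import accumulate


def offsets_fixed_subtraction(durations, subtract=1):
    """Offsets as in the earlier script: running total minus a constant."""
    return [total - subtract for total in accumulate(durations)]


def offsets_per_transition(durations, fade_duration):
    """Offsets as corrected here: running total minus fade_duration * (i + 1)."""
    return [
        total - fade_duration * (i + 1)
        for i, total in enumerate(accumulate(durations))
    ]


durations = [4, 8, 12, 5]
print(offsets_fixed_subtraction(durations))  # [3, 11, 23, 28]
print(offsets_per_transition(durations, 1))  # [3, 10, 21, 25]
```

The first offset agrees in both versions, and the error then grows by one fade duration with each further transition. That is consistent with the symptom in the question, where only the later segments were out of sync.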

score 0 · Answer 4 · edited Jan 26 '22 at 10:10

I wrote a similar but simpler script:

#!/bin/bash
# usage: ls -1 something*.mp4 | ffmpeg_xfade.sh output.mp4

fdur=0.5
ftrans=pixelize
f0n=0
f1n=1
alld=0

while read f; do
    allvf="$allvf$vf"
    allaf="$allaf$af"
    inputs="$inputs -i $f "
    d=$(ffprobe -v error -select_streams v:0 -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 "$f")
    alld=$(bc -l <<< "$alld + $d")
    offset=$(bc -l <<< "$alld - $fdur * $f1n")
    vf="[vfade$f0n][$f1n:v]xfade=transition=$ftrans:duration=$fdur:offset=$offset[vfade$f1n];"
    af="[afade$f0n][$f1n:a]acrossfade=d=$fdur[afade$f1n];"
    (( f0n++ ))
    (( f1n++ ))
done

f0n=$(( f0n - 1 ))
allvf="[0:v]copy[vfade0];$allvf[vfade$f0n]format=yuv420p"
allaf="[0:a]acopy[afade0];$allaf[afade$f0n]acopy"

#set -vx
ffmpeg -y -hide_banner $inputs \
    -filter_complex "$allvf;$allaf" \
    -c:v h264_nvenc -preset p7 -profile:v high -rc-lookahead 8 -spatial_aq 1 -pix_fmt yuv420p \
    -c:a libopus \
    "$1"

The resulting ffmpeg command for this resembles:

ffmpeg -y -hide_banner  -i f1.mp4  -i f2.mp4 -i f3.mp4  -i f4.mp4         
-filter_complex "[0:v]copy[vfade0];[vfade0][1:v]xfade=transition=pixelize:duration=0.5:offset=-.5[vfade1];
[vfade1][2:v]xfade=transition=pixelize:duration=0.5:offset=10.0[vfade2];
[vfade2][3:v]xfade=transition=pixelize:duration=0.5:offset=20.0[vfade3];
[vfade3]format=yuv420p;
[0:a]acopy[afade0];[afade0][1:a]acrossfade=d=0.5[afade1];
[afade1][2:a]acrossfade=d=0.5[afade2];
[afade2][3:a]acrossfade=d=0.5[afade3];
[afade3]acopy"       -c:v h264_nvenc -preset p7 -profile:v high -rc-lookahead 8 -spatial_aq 1 -pix_fmt yuv420p -c:a libopus "output.mp4"

Your answer could be improved by explaining more about what the code does and how it helps the OP. — Tyler2P, Nov 20 '21 at 09:22

score -1 · Answer 5 · answered Aug 22 '21 at 11:44

The Python script in the earlier answer (the one linked to the gist) has a mistake in its offset calculation. Another answer pointed this out, but presented the fix as a new function definition.

If anyone just wants the corrected original script for the previous answer, replace the chunk after # Prepare the filter graph with:

# Prepare the filter graph
video_fades = ""
audio_fades = ""
last_fade_output = "0:v"
last_audio_output = "0:a"
video_length = 0
fade_duration = 0.5

for i in range(len(segments) - 1):
    # Video graph: chain the xfade operator together
    video_length += file_lengths[i]
    next_fade_output = "v%d%d" % (i, i + 1)
    video_fades += "[%s][%d:v]xfade=duration=%.3f:offset=%.3f[%s]; " % \
        (last_fade_output, i + 1, fade_duration, video_length - fade_duration*(i+1), next_fade_output)
    last_fade_output = next_fade_output

    # Audio graph: use the same fade duration as xfade so both streams shorten equally
    next_audio_output = "a%d%d" % (i, i + 1)
    audio_fades += "[%s][%d:a]acrossfade=d=%.3f[%s]%s " % \
        (last_audio_output, i + 1, fade_duration, next_audio_output, ";" if (i+1) < len(segments)-1 else "")
    last_audio_output = next_audio_output

# Assemble the FFMPEG command arguments
ffmpeg_args = ['ffmpeg',
               *itertools.chain(*files_input),
               '-filter_complex', video_fades + audio_fades,
               '-map', '[%s]' % last_fade_output,
               '-map', '[%s]' % last_audio_output,
               '-y',
               args.output_filename]

# Run FFMPEG
subprocess.run(ffmpeg_args)

it would be helpful to see the final underlying ffmpeg command for this. — user1432181, Jan 22 '22 at 14:29
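
In response to the comment above, here is a self-contained Python sketch that prints the kind of ffmpeg command such a script ends up running. It is not the gist's code: the function name `build_xfade_command`, the `[vN]`/`[aN]` labels, and the file names and durations in the example are hypothetical. It uses the same fade length for xfade and acrossfade so that video and audio are shortened by the same amount.

```python
import shlex


def build_xfade_command(files, durations, fade_duration=0.5, output="out.mp4"):
    """Return the ffmpeg argument list that crossfades `files` in order.

    `durations` are the input lengths in seconds, one per file.
    """
    if len(files) < 2 or len(files) != len(durations):
        raise ValueError("need at least two files and one duration per file")

    args = ["ffmpeg"]
    for name in files:
        args += ["-i", name]

    video_chain, audio_chain = [], []
    last_video, last_audio = "0:v", "0:a"
    total = 0.0
    for i in range(len(files) - 1):
        total += durations[i]
        # Each earlier transition already shortened the output by one fade.
        offset = total - fade_duration * (i + 1)
        next_video, next_audio = "v%d" % (i + 1), "a%d" % (i + 1)
        video_chain.append(
            "[%s][%d:v]xfade=transition=fade:duration=%.3f:offset=%.3f[%s]"
            % (last_video, i + 1, fade_duration, offset, next_video)
        )
        # Same fade length for audio so both streams end up equally long.
        audio_chain.append(
            "[%s][%d:a]acrossfade=d=%.3f[%s]"
            % (last_audio, i + 1, fade_duration, next_audio)
        )
        last_video, last_audio = next_video, next_audio

    args += [
        "-filter_complex", ";".join(video_chain + audio_chain),
        "-map", "[%s]" % last_video,
        "-map", "[%s]" % last_audio,
        "-y", output,
    ]
    return args


# Hypothetical file names and durations, 1-second fades.
command = build_xfade_command(["v0.mp4", "v1.mp4", "v2.mp4"], [4, 8, 12], 1)
print(shlex.join(command))
```

Returning a list rather than a single string keeps it usable with `subprocess.run` without shell quoting; `shlex.join` (Python 3.8+) is only used to show a copy-pasteable command line.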
