My goal: concatenate an image (foo.jpg) with a video (bar.mp4, 3 seconds long), showing foo.jpg for 2 seconds only. The output video should be about 5 seconds long.
I used:
ffmpeg -loop 1 -t 2 -framerate 1 -i foo.jpg -f lavfi -t 2 -i anullsrc -i bar.mp4 -filter_complex "[0][1][2:v][2:a] concat=n=2:v=1:a=1 [vpre][a];[vpre]fps=24,scale=32:24[v]" -map "[v]" -map "[a]" out.mp4
I think the command means: loop foo.jpg for 2 seconds at a frame rate of 1 frame per second, and pair that 2-second clip with a silent audio track. Then concatenate it with bar.mp4, set the final output frame rate to 24 fps, and scale it to 32x24 (intentionally tiny for testing).
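To make that reading easier to check, the same options can be grouped per input. This is only a sketch of the command above, not a fix: the arguments are identical, and the guard around the ffmpeg call (run only when ffmpeg and both input files are present) is an addition for illustration.

```shell
# Build the argument list step by step so each group can carry a comment.

# Input 0: foo.jpg looped as a still-image video, read at 1 fps, limited to 2 s.
set -- -loop 1 -t 2 -framerate 1 -i foo.jpg

# Input 1: generated silence from the lavfi anullsrc source, limited to 2 s.
set -- "$@" -f lavfi -t 2 -i anullsrc

# Input 2: the video (with its own audio) to append.
set -- "$@" -i bar.mp4

# Segment 1 = [0] video + [1] silence, segment 2 = [2:v] + [2:a].
# After concat, the video is converted to 24 fps and scaled to 32x24.
set -- "$@" -filter_complex '[0][1][2:v][2:a] concat=n=2:v=1:a=1 [vpre][a];[vpre]fps=24,scale=32:24[v]'

# Map the filtered video and the concatenated audio into out.mp4.
set -- "$@" -map '[v]' -map '[a]' out.mp4

# Added guard (not in the original command): skip the run if anything is missing.
if command -v ffmpeg >/dev/null 2>&1 && [ -f foo.jpg ] && [ -f bar.mp4 ]; then
  ffmpeg "$@"
else
  echo "ffmpeg or input files not found; command not run" >&2
fi
```

Splitting the list this way makes it clear that each -t 2 applies only to the input that follows it, and that nothing in the command limits the length of the output itself.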
Expected: the output is about 5 seconds long in total.
Reality: the output is 3 minutes and 38 seconds long. The first 5 seconds are perfect. After that, the video stays silent, with the frame from the 5-second mark frozen until the end.
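One way to see which stream carries the extra length is to ask ffprobe for the per-stream and container durations. The sketch below assumes ffprobe is installed alongside ffmpeg; probe_durations is a hypothetical helper name, not part of either tool.

```shell
# Hypothetical helper: print the durations (in seconds) that ffprobe reports for a file.
probe_durations() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  # One line per stream: codec type and stream duration.
  ffprobe -v error -show_entries stream=codec_type,duration -of csv=p=0 "$1"
  # Container-level duration on its own line.
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# The length reported above (3 min 38 s) converted to seconds,
# for comparison with ffprobe's output, which is also in seconds.
observed_s=$((3 * 60 + 38))
echo "reported length: ${observed_s} s, expected about 5 s"

# Only probe when ffprobe and the output file are actually available.
if command -v ffprobe >/dev/null 2>&1 && [ -f out.mp4 ]; then
  probe_durations out.mp4
fi
```

If the video stream reports roughly the full length while the audio stream reports about 5 seconds, the extra time is coming from the video side of the filter graph.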
My research suggests that it might be related to -video_track_timescale.
This command, with -video_track_timescale 600 added, also fails with the same overlong result:
ffmpeg -loop 1 -t 2 -framerate 1 -i foo.jpg -f lavfi -t 2 -i anullsrc -i bar.mp4 -filter_complex "[0][1][2:v][2:a] concat=n=2:v=1:a=1 [vpre][a];[vpre]fps=24,scale=32:24[v]" -map "[v]" -map "[a]" -video_track_timescale 600 out.mp4
Additional info about file bar.mp4:
$ ffmpeg -i bar.mp4
ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with Apple clang version 11.0.3 (clang-1103.0.32.62)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.3.1_1 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bar.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.29.100
Duration: 00:00:01.94, start: 0.000000, bitrate: 641 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, smpte170m/unknown/smpte170m), 480x360 [SAR 1:1 DAR 4:3], 354 kb/s, 24.58 fps, 24.58 tbr, 113734695.00 tbn, 49.16 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 280 kb/s (default)
Metadata:
handler_name : SoundHandler
At least one output file must be specified