
I have some videos in either MP4 or WebM format, and I'd like to use ffmpeg to prepend a silent 4-second segment to each one that displays some text in the center.

Some other requirements:

  • try to avoid re-encoding the video
  • need to maintain the quality (resolution, bitrate, etc)
  • (optional) to make the text fade in/out

I am new to ffmpeg, so any help is appreciated. Thanks in advance.

Example ffprobe information for mp4 below:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf55.33.100
  Duration: 00:00:03.84, start: 0.042667, bitrate: 1117 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1021 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 140 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Example ffprobe information for WebM:

Input #0, matroska,webm, from 'input.webm':
  Metadata:
    encoder         : Lavf55.33.100
  Duration: 00:00:03.80, start: 0.000000, bitrate: 1060 kb/s
    Stream #0:0(eng): Video: vp8, yuv420p, 1280x720, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn, 1k tbc (default)
    Stream #0:1(eng): Audio: vorbis, 48000 Hz, stereo, fltp (default)

Screenshot from joined.mp4

Screenshot for step 3 console

Ryan

2 Answers


You'll have to generate a 4 second video with dummy audio matching the parameters of the existing video, including timebase, and then use the concat demuxer with streamcopy.

For the sample files shown in Q:

Step 1 Generate text video

ffmpeg -f lavfi -r 30 -i color=black:1280x720 -f lavfi -i anullsrc -vf "drawtext=fontfile='/path/to/font.ttf':fontcolor=FFFFFF:fontsize=50:text='Your text':x='(main_w-text_w)/2':y='(main_h-text_h)/2',fade=t=in:st=0:d=1,fade=t=out:st=3:d=1" -c:v libx264 -b:v 1000k -pix_fmt yuv420p -video_track_timescale 15360 -c:a aac -ar 48000 -ac 2 -sample_fmt fltp -t 4 intro.mp4

For WebM, replace -c:v libx264 with -c:v libvpx, -c:a aac with -c:a libvorbis, and intro.mp4 with intro.webm. You can also drop -video_track_timescale 15360, since the WebM files I've seen all use the same timescale.
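
Putting those substitutions together, the step 1 command could be wrapped in a small script that picks the codec options from the output extension. This is only a sketch: `intro_codec_args` and `make_intro` are hypothetical helper names, the font path is a placeholder, and the size, frame rate, and timescale are the values from the sample files in the question.

```shell
#!/bin/sh
# Print the container-specific options for the intro clip.
# (intro_codec_args is a hypothetical helper, not part of ffmpeg.)
intro_codec_args() {
    case "$1" in
        *.webm) echo "-c:v libvpx -c:a libvorbis" ;;
        # 15360 is the timescale (tbn) of the sample input.mp4 in the question.
        *.mp4)  echo "-c:v libx264 -video_track_timescale 15360 -c:a aac" ;;
        *)      echo "unsupported container: $1" >&2; return 1 ;;
    esac
}

# Build the 4-second title card; $1 is the output file, $2 the font path.
make_intro() {
    out=$1
    font=$2
    codec_args=$(intro_codec_args "$out") || return 1
    # $codec_args is left unquoted on purpose so it splits into separate options.
    ffmpeg -f lavfi -r 30 -i color=black:1280x720 -f lavfi -i anullsrc \
        -vf "drawtext=fontfile='$font':fontcolor=FFFFFF:fontsize=50:text='Your text':x='(main_w-text_w)/2':y='(main_h-text_h)/2',fade=t=in:st=0:d=1,fade=t=out:st=3:d=1" \
        $codec_args -b:v 1000k -pix_fmt yuv420p \
        -ar 48000 -ac 2 -sample_fmt fltp -t 4 "$out"
}

# Usage (not run here): make_intro intro.webm /path/to/font.ttf
```

The remaining options are the same for both containers, so only the codec names and the MP4-only timescale option vary.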

Step 2 Prepare concat file, say, list.txt

file 'intro.mp4'
file 'input.mp4'
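
When there are many videos to process, the list file above can also be written by a script. A minimal sketch, assuming a hypothetical helper named `write_concat_list`; it escapes single quotes in file names as `'\''`, which is the quoting the concat demuxer expects inside a single-quoted name.

```shell
#!/bin/sh
# Write a concat-demuxer list: one "file '...'" line per argument, in order.
# (write_concat_list is a hypothetical helper name.)
write_concat_list() {
    list=$1
    shift
    : > "$list"    # create or truncate the list file
    for f in "$@"; do
        # Inside single quotes, a literal ' has to be written as '\''
        escaped=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
        printf "file '%s'\n" "$escaped" >> "$list"
    done
}

# Intro first, then the original video.
write_concat_list list.txt intro.mp4 input.mp4
```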

Step 3 Concat

ffmpeg -f concat -i list.txt -c copy -fflags +genpts joined.mp4

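The parameters that must match the main video are the video size (1280x720), frame rate (-r 30), pixel format (-pix_fmt yuv420p), sample rate (-ar 48000), sample format (-sample_fmt fltp), channel count (-ac 2) and, of course, the codecs. For MP4, the timescale (-video_track_timescale 15360) must match as well.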
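
Since each video may have different parameters, they can be read with ffprobe instead of being copied by eye from the console readout. The sketch below is one way to do it, assuming two hypothetical helpers: `probe_stream` prints the relevant stream fields as key=value lines, and `get_field` picks one value out of that text.

```shell
#!/bin/sh
# Print the parameters of one stream as key=value lines.
# $1 = input file, $2 = stream specifier such as v:0 or a:0.
probe_stream() {
    ffprobe -v error -select_streams "$2" \
        -show_entries stream=codec_name,width,height,pix_fmt,r_frame_rate,time_base,sample_rate,channels,sample_fmt \
        -of default=noprint_wrappers=1 "$1"
}

# Read key=value lines on stdin and print the value for key $1.
# (get_field is a hypothetical helper name.)
get_field() {
    sed -n "s/^$1=//p" | head -n 1
}

# Usage (not run here):
#   video=$(probe_stream input.mp4 v:0)
#   width=$(printf '%s\n' "$video" | get_field width)
#   height=$(printf '%s\n' "$video" | get_field height)
#   timebase=$(printf '%s\n' "$video" | get_field time_base)   # e.g. 1/15360
```

For the sample MP4, the denominator of time_base is the value passed to -video_track_timescale in step 1.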

Gyan
  • Thanks Mulvya, each video would have different resolution, bitrate, etc. Is it possible for ffmpeg to gather those information automatically? e.g. something like copy codecs – Ryan Feb 12 '16 at 09:39
  • Example : Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4': Metadata: major_brand : isom minor_version : 512 compatible_brands: isomiso2avc1mp41 encoder : Lavf55.33.100 Duration: 00:00:03.84, start: 0.042667, bitrate: 1117 kb/s Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1021 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default) Metadata: handler_name : VideoHandler Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 140 kb/s (default) Metadata: handler_name : SoundHandler – Ryan Feb 12 '16 at 09:44
  • Not automatically, you will have to run `ffprobe input` and scrape that data from the readout. – Gyan Feb 12 '16 at 09:53
  • Thanks for the guide. Got the following error with step 1 so far: syntax error near unexpected token `(' – Ryan Feb 12 '16 at 11:24
  • Enclose the x and y expressions in single quotes ('). This is a matter of escaping characters. – Gyan Feb 12 '16 at 11:36
  • Fixed the syntax error, and the intro mp4 file successfully generated. I am also able to join the 2 videos, however the joined.mp4 seems to be blurred on the original video part. Is it because the bitrate of intro.mp4 is too low? (P.S. I tried with '-y -auto_convert 1' in the step 3 and no difference) – Ryan Feb 12 '16 at 12:03
  • Either use higher `-b:v` or switch to `-crf 18` – Gyan Feb 12 '16 at 12:08
  • Tried with the following 1) -b:v 1000k, bitrate of intro.mp4 is 130 kb/s; 2) -b:v 5000k, bitrate of intro.mp4 is 161 kb/s; 3) switch to -crf 18, bitrate of intro.mp4 is 65 kb/s. Also tried to increase -b:v more, and end result seems to be capped at 165 kb/s. – Ryan Feb 12 '16 at 12:35
  • It isn't capped. x264 optimizes bitrate. If you want to force a higher rate, use `-minrate 1000k` with `-b:v 1000k`. That said, can you post a screenshot of an intro frame? – Gyan Feb 12 '16 at 12:38
  • Yes, Added a screenshot from intro to the bottom of original question post (P.S. tried with minrate 1000k, no difference) – Ryan Feb 12 '16 at 12:54
  • You mean the edges? Increase font size - may help a bit but not much else to be done about that. Try a font with thicker strokes. – Gyan Feb 12 '16 at 13:07
  • Nope. the problem is with the joined.mp4 (attached another screenshot), I think the bitrate between intro.mp4 and original.mp4 made the joined.mp4 blurred. Do you think that was the reason? If so, is there a way to solve it within step 3? (tried auto_convert, but no luck) – Ryan Feb 12 '16 at 13:12
  • Nope, there's no re-encoding happening. Attach the console output of the concat command. – Gyan Feb 12 '16 at 13:26
  • Attached console screenshot. A big thanks for your help. I noticed the original file encoder is Lavf55.33.100 and the intro.mp4 encoder is Lavf57.24.101 in case that helps. – Ryan Feb 12 '16 at 14:00
  • It was encoded with ffmpeg, but I am unsure which command was used. – Ryan Feb 12 '16 at 14:32
  • If you can, share the main file. I can check it later. – Gyan Feb 12 '16 at 14:38
  • Sure, can I email the file to you? – Ryan Feb 12 '16 at 14:45
  • There's no difference. I took the first frame from main and first frame of main in joined and did a difference blend in Photoshop. Result is pure black i.e. identical frames. – Gyan Feb 12 '16 at 15:34
  • You will see the difference from the 5th second onwards in joined.mp4. – Ryan Feb 12 '16 at 15:46
  • I just ran this cmd, and it's black after the title card : `ffmpeg -i joined.mp4 -i main.mp4 -an -filter_complex "[1:v]setpts=PTS+4/TB,setsar=1[1v];[0:v][1v]blend=difference,format=gray[v]" -map [v] diff.mp4` – Gyan Feb 12 '16 at 15:52
  • sorry, I don't understand the above command. my knowledge of ffmpeg is limited. :) – Ryan Feb 12 '16 at 16:17
  • It compares the frames in the two files (main and joined) after the 4th second, so if two frames are identical, the resulting output is black. If there are differences, there will be gray/white shapes. Experiment with the number before `/TB` to see what that looks like - it changes the timeline alignment of the two videos during the comparison process. – Gyan Feb 12 '16 at 16:25
  • Hmm, the joined.mp4 looks blurred when I watch it in QuickTime or in a browser. :) Attached the screenshot to the original question post. – Ryan Feb 12 '16 at 16:47
  • Right, I solved the problem by adding refs 6 to step 1. Thanks again, Mulvya, for your help. – Ryan Feb 12 '16 at 17:00
  • For step 1, is it possible to use a single image as background? Thanks in advance – Ryan Mar 13 '16 at 01:26
  • Sure. Replace `-f lavfi -r 30 -i color=black:1280x720` with `-loop 1 -r 30 -i image.png` – Gyan Mar 13 '16 at 04:30
  • Note that for multiple lines, see [here](https://stackoverflow.com/a/11139999/1074998). And for text position, see [here](https://superuser.com/a/939386/248836). E.g. `ffmpeg -f lavfi -r 30 -i color=black:1280x720 -f lavfi -i anullsrc -vf 'drawtext=fontsize=50:fontcolor=White:fontfile=':text='Line 1':x='(w-text_w)/2:y=(h-text_h)/2, drawtext=fontsize=50:fontcolor=White:fontfile=':text='Line 2':x='(w-text_w)/2:y=(h-text_h)/2+55, drawtext=fontsize=50:fontcolor=White:fontfile=':text='Line 3':x='(w-text_w)/2:y=(h-text_h)/2+110' -t 4 intro.mp4` – 林果皞 Jan 22 '21 at 17:01

The short answer is that you cannot encode new data as MP4 or WebM and insert it at the front of the video stream. Those formats simply do not work like that. Both of these encoding formats are lossy, so if you decode and re-encode them, additional information will be lost or changed by the second encoding. You could do something else, but what you are trying to do will not work.

MoDJ