
I need to automatically split a video of a speech into words, so that every word ends up in a separate video file. Do you know of any way to do this?

My plan was to detect the silent parts and use them as word separators. But I couldn't find any tool that does this, and it looks like ffmpeg is not the right tool for the job.

TermiT

1 Answer


You could first use ffmpeg to detect intervals of silence, like this:

ffmpeg -i "input.mov" -af silencedetect=noise=-30dB:d=0.5 -f null - 2> vol.txt

Because stderr is redirected, this writes ffmpeg's console log to vol.txt, with readings that look like this:

[silencedetect @ 00000000004b02c0] silence_start: -0.0306667
[silencedetect @ 00000000004b02c0] silence_end: 1.42767 | silence_duration: 1.45833
[silencedetect @ 00000000004b02c0] silence_start: 2.21583
[silencedetect @ 00000000004b02c0] silence_end: 2.7585 | silence_duration: 0.542667
[silencedetect @ 00000000004b02c0] silence_start: 3.1315
[silencedetect @ 00000000004b02c0] silence_end: 5.21833 | silence_duration: 2.08683
[silencedetect @ 00000000004b02c0] silence_start: 5.3895
[silencedetect @ 00000000004b02c0] silence_end: 7.84883 | silence_duration: 2.45933
[silencedetect @ 00000000004b02c0] silence_start: 8.05117
[silencedetect @ 00000000004b02c0] silence_end: 10.0953 | silence_duration: 2.04417
[silencedetect @ 00000000004b02c0] silence_start: 10.4798
[silencedetect @ 00000000004b02c0] silence_end: 12.4387 | silence_duration: 1.95883
[silencedetect @ 00000000004b02c0] silence_start: 12.6837
[silencedetect @ 00000000004b02c0] silence_end: 14.5572 | silence_duration: 1.8735
[silencedetect @ 00000000004b02c0] silence_start: 14.9843
[silencedetect @ 00000000004b02c0] silence_end: 16.5165 | silence_duration: 1.53217
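A minimal sketch of scraping that log in Python, assuming it was saved to vol.txt as in the command above. The function name `speech_intervals` is made up for illustration; it pairs each silence_end with the next silence_start and ignores a silence_start that has no preceding silence_end (such as the very first line of the sample).

```python
import re

# Matches "silence_start: 2.21583" and "silence_end: 2.7585"; the
# "silence_duration" field does not match and is ignored.
SILENCE_RE = re.compile(r"silence_(start|end): (\S+)")

def speech_intervals(log_text):
    """Return (silence_end, next_silence_start) pairs from a silencedetect log."""
    intervals = []
    last_end = None
    for kind, value in SILENCE_RE.findall(log_text):
        t = float(value)
        if kind == "end":
            last_end = t
        elif last_end is not None:
            # A silence starts here, so the span since the last
            # silence_end is a non-silent segment.
            intervals.append((last_end, t))
            last_end = None
    return intervals

if __name__ == "__main__":
    # Assumes the log file produced by the command above.
    with open("vol.txt") as f:
        for end, start in speech_intervals(f.read()):
            print(f"{end},{start}")
```

Run as a script, this prints one comma-separated pair per line, which can be redirected into a CSV file.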

You then generate commands that cut from each silence end to the next silence start. You will probably want to add handles of, say, 250 ms on each side, so each clip will be 2 * 250 ms longer than the detected non-silent segment.

ffmpeg -ss <silence_end - 0.25> -t <next_silence_start - silence_end + 2 * 0.25> -i input.mov word-N.mov

(I have skipped specifying audio/video parameters)

You'll want to write a script to scrape the console log and generate a structured (maybe CSV) file with the timecodes - one pair on each line: silence_end and the next silence_start. And then another script to generate the commands with each pair of numbers.
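The command-generation step could be sketched in Python as follows. This is an illustration under stated assumptions, not a tested pipeline: `split_commands` is a hypothetical helper, the file names follow the command template above, and the start time is clamped at zero so the first handle cannot go negative. As in the template, audio/video parameters are left out.

```python
import shlex

HANDLE = 0.25  # seconds of padding on each side, as suggested above

def split_commands(pairs, src="input.mov", handle=HANDLE):
    """Build one ffmpeg argument list per (silence_end, next_silence_start) pair."""
    commands = []
    for n, (end, start) in enumerate(pairs, 1):
        ss = max(end - handle, 0.0)        # start slightly before the word
        duration = (start + handle) - ss   # stop slightly after it
        commands.append([
            "ffmpeg", "-ss", f"{ss:.3f}", "-t", f"{duration:.3f}",
            "-i", src, f"word-{n}.mov",
        ])
    return commands

if __name__ == "__main__":
    # First two pairs taken from the sample log above.
    pairs = [(1.42767, 2.21583), (2.7585, 3.1315)]
    for cmd in split_commands(pairs):
        print(shlex.join(cmd))  # shell-quoted, one command per line
```

Keeping each command as an argument list means it can also be passed directly to `subprocess.run` instead of being printed, which sidesteps shell quoting of the file names.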

Pau Coma Ramirez
Gyan
  • 20
    As a oneliner: `ffmpeg -i input.mkv -filter_complex "[0:a]silencedetect=n=-90dB:d=0.3[outa]" -map [outa] -f s16le -y /dev/null |& F='-aq 70 -v warning' perl -ne 'INIT { $ss=0; $se=0; } if (/silence_start: (\S+)/) { $ss=$1; $ctr+=1; printf "ffmpeg -nostdin -i input.mkv -ss %f -t %f $ENV{F} -y %03d.mkv\n", $se, ($ss-$se), $ctr; } if (/silence_end: (\S+)/) { $se=$1; } END { printf "ffmpeg -nostdin -i input.mkv -ss %f $ENV{F} -y %03d.mkv\n", $se, $ctr+1; }' | bash -x` – Vi. Jun 13 '16 at 14:28
  • 1
    This one-liner doesn't work on Mac: -bash: syntax error near unexpected token `&' – John Smith Sep 17 '16 at 21:09
  • 6
    @JohnSmith, Macs ship with an old (pre-4) bash by default. Replace `|&` with `2>&1 |`. – Vi. Sep 26 '16 at 17:50
  • I am using "com.writingminds:FFmpegAndroid:0.3.2". Can you help me get the list of silences? – Rajesh Gauswami Oct 11 '17 at 11:14
  • 1
    @Vi.'s one-liner works perfectly, thanks! I now wonder a) whether there is a way to ensure ffmpeg does not re-encode the pieces produced this way, but just copies the content into them, b) what the best way is to put all the pieces back together, and c) how to automatically add, say, a 0.2-second audio+video cross-dissolve between pieces, to make the result a bit more pleasant to the eye. This would make it the perfect script for editing video interviews! – giacecco May 31 '18 at 14:57
  • 3
    @giacecco To skip re-encoding, add `-c copy` to the last ffmpeg command line. The other effects require a more complicated script. Maybe I'll implement it and post it as an answer someday... – Vi. Jun 01 '18 at 15:29
  • 2
    How can one adjust the noise parameters, `noise=-30dB:d=0.5`? I have tried different values, but I am not always getting `silence_start` and `silence_end` pairs; sometimes one of the two is missing. – innuendo Feb 10 '19 at 15:41
  • 2
    @Vi. it seems you can earn 100 points by answering this question https://stackoverflow.com/questions/55057778/how-can-i-split-an-mp4-video-with-ffmpeg-every-time-the-volume-is-zero Please take a look. – Juan Pablo Fernandez Mar 11 '19 at 23:23
  • @JuanPabloFernandez, Thanks for the suggestion. – Vi. Mar 13 '19 at 14:19
  • `ffmpeg -i in.m4a -filter_complex "[0:a]silencedetect=n=-90dB:d=0.3[outa]" -map [outa] -f s16le -y /dev/null |& F='-aq 70 -v warning' perl -ne 'INIT { $ss=0; $se=0; } if (/silence_start: (\S+)/) { $ss=$1; $ctr+=1; printf "ffmpeg -nostdin -i in.m4a -c copy -ss %f -t %f $ENV{F} -y %03d.m4a\n", $se, ($ss-$se), $ctr; } if (/silence_end: (\S+)/) { $se=$1; } END { printf "ffmpeg -nostdin -i in.m4a -c copy -ss %f $ENV{F} -y %03d.m4a\n", $se, $ctr+1; }' | bash -x`. @giacecco, @Vi.: I added `-c copy` twice to avoid re-encoding. It takes only seconds and doesn't bloat the new files to ~4 times their size. – Marek Möhling Aug 06 '20 at 16:28
  • @giacecco, @Vi.: PS: I used this with a .m4a audio file (downloaded from youtube+com/watch?v=eMqYq2VMOck). Neither your original script nor the edited one works with the .mov or .mp4 video files I have (e.g. youtube+com/watch?v=6zQP6vgWiek). – Marek Möhling Aug 06 '20 at 16:57
  • @Vi. LOL, wow, apparently I've been trying to solve this problem for 6 years now. Anyway, thanks for your reply back then, it got me closer, but this one-liner still fails on filenames with " -" in them. As all my video files have the RMS dB (a negative number) listed in the file name, the one-liner doesn't work on any of them. I tried putting the filename in single quotes, and also tried double, and tried escaping the minus signs with backslashes, and none of them solved the problem. – John Smith Feb 07 '22 at 01:19
  • 1
    @JohnSmith I have also published two non-one-liner versions: https://gist.github.com/vi/2fe3eb63383fcfdad7483ac7c97e9deb and https://gist.github.com/vi/2af29b9652a813ffe4b7e87c9a895f81. They may be more careful with filenames (not checked). – Vi. Feb 07 '22 at 11:56