I'm trying to concatenate multiple short .mp4 video clips from a security camera. The camera records short clips, each padded with a few seconds before and after the timespan in which motion was detected. For example, two minutes of video will often be broken up into four ~35 second clips, with the first/last few seconds of each clip duplicating the last/first few seconds of the previous/next clip.
I currently concatenate the clips with the ffmpeg concat demuxer, as described in "How to concatenate two MP4 files using FFmpeg?", with
(echo file 'first file.mp4' & echo file 'second file.mp4' )>list.txt
ffmpeg -safe 0 -f concat -i list.txt -c copy output.mp4
Alternatively, I remux them (stream copy, no re-encoding) into intermediate MPEG-2 transport streams, which I can then concatenate with the file-level concat protocol, as described here: https://trac.ffmpeg.org/wiki/Concatenate#protocol, with
ffmpeg -i "first file.mp4" -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate1.ts
ffmpeg -i "second file.mp4" -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate2.ts
ffmpeg -i "concat:intermediate1.ts|intermediate2.ts" -c copy -bsf:a aac_adtstoasc output.mp4
But either way, the resulting video (output.mp4) jumps backward in time a few seconds every half-minute or so because of the duplicated frames.
I want to throw out the duplicate frames and join the clips based on their timestamps, so that the concatenated full-length video plays back smoothly. I'd strongly prefer to do this on Windows with ffmpeg if possible. Surely this has been done before? Are there timestamps in the .mp4 files that I can use to determine how much the clips overlap, so that I can splice at the right point in time? If so, how do I read them, how do I splice at an exact point in time, and how do I deal with keyframes when the splice point does not fall on one?
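To make the goal concrete, here is a rough sketch of the overlap arithmetic I have in mind. It assumes I could somehow obtain each clip's wall-clock start time and duration; the values below are made up for illustration, and `overlap_seconds` is just a hypothetical helper name, not part of ffmpeg.

```python
from datetime import datetime, timedelta


def overlap_seconds(clips):
    """Return how many seconds to trim from the head of each clip.

    `clips` is a list of (start, duration_seconds) tuples sorted by start
    time. A clip is trimmed by however much of it was already covered by
    the clips before it; clips that start after a gap are not trimmed.
    """
    trims = []
    covered_until = None  # latest end time seen so far
    for start, duration in clips:
        end = start + timedelta(seconds=duration)
        if covered_until is None or start >= covered_until:
            trims.append(0.0)  # first clip, or a gap: keep everything
        else:
            overlap = (covered_until - start).total_seconds()
            trims.append(min(duration, overlap))
        covered_until = end if covered_until is None else max(covered_until, end)
    return trims


# Hypothetical example: four 35-second clips that start 30 seconds apart.
clips = [
    (datetime(2020, 1, 1, 12, 0, 0), 35.0),
    (datetime(2020, 1, 1, 12, 0, 30), 35.0),
    (datetime(2020, 1, 1, 12, 1, 0), 35.0),
    (datetime(2020, 1, 1, 12, 1, 30), 35.0),
]
print(overlap_seconds(clips))  # [0.0, 5.0, 5.0, 5.0]
```

With those made-up numbers, trimming 5 seconds from the start of every clip after the first would leave 125 seconds of continuous video. What I don't know is how to get the real start times out of the files, or how to cut at those offsets without running into the keyframe problem.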