Batch merge audio files by specific timestamp without reencoding

Question

I want to batch merge MP3 audio files where each file has a specific start time. Below is the command I currently run (spawned through fluent-ffmpeg) to merge 3 files that start at 200, 7400 and 10600 ms respectively.

ffmpeg -i firstFile.mp3 -i secondFile.mp3 -i thirdFile.mp3 -filter_complex "[0]adelay=200[a0];[1]adelay=7400[a1];[2]adelay=10600[a2];[a0][a1][a2]amix=inputs=3:dropout_transition=1000,volume=3" -f mp3 pipe:1
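
For a larger batch, the filter graph in the command above can be generated from a list of start offsets instead of being typed by hand. The following is only a sketch: `buildDelayMixFilter` is a hypothetical helper name, not part of fluent-ffmpeg, and it simply reproduces the same graph string for any number of inputs.

```javascript
// Hypothetical helper (not a fluent-ffmpeg API): builds the same
// adelay + amix graph as the command above from start offsets in ms.
function buildDelayMixFilter(startsMs, dropoutTransition = 1000, volume = 3) {
  // One adelay per input: [0]adelay=200[a0];[1]adelay=7400[a1];...
  const delays = startsMs.map((ms, i) => `[${i}]adelay=${ms}[a${i}]`);
  // All delayed labels feed a single amix: [a0][a1][a2]amix=inputs=3...
  const labels = startsMs.map((_, i) => `[a${i}]`).join("");
  const mix =
    `${labels}amix=inputs=${startsMs.length}` +
    `:dropout_transition=${dropoutTransition},volume=${volume}`;
  return [...delays, mix].join(";");
}

console.log(buildDelayMixFilter([200, 7400, 10600]));
```

The returned string is meant to be passed as the single argument after `-filter_complex`. This only automates the existing approach, so the output is still re-encoded.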

This works pretty well, but for longer files the re-encoding makes the process take a very long time. So I wanted to do the same thing with the concat demuxer. Since I already know how long each audio file is, I put silent audio files between them to create a delay before the next file, so that each file starts at the position it is supposed to.

#concatfile.txt

file silence.mp3
outpoint 200
file firstFile.mp3
file silence.mp3
outpoint 1500
file secondFile.mp3
file silence.mp3
outpoint 2000
file thirdFile.mp3

ffmpeg -f concat -safe 0 -i concatfile.txt -c copy output.mp3
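
The gap lengths in the list file above follow from the start times and the file durations: each silence has to cover the distance between the end of the previous file and the start of the next one. A minimal sketch of that arithmetic, which also writes the list file text, could look like this. The durations 5700 and 1200 ms are inferred from the numbers in the question (7400 - 200 - 1500 and 10600 - 7400 - 2000), and `buildConcatList` is a hypothetical name. As far as I know the concat demuxer reads a bare number as seconds, so the sketch writes `0.200` rather than `200`; that is an assumption about how the values in the list were meant.

```javascript
// Hypothetical helper: derives the silence gaps from start times and
// durations (all in ms) and returns concat demuxer list text.
function buildConcatList(clips, silenceFile = "silence.mp3") {
  const lines = [];
  let cursorMs = 0; // nominal end of everything written so far
  for (const clip of clips) {
    const gapMs = clip.startMs - cursorMs;
    if (gapMs < 0) throw new Error(`${clip.file} overlaps the previous clip`);
    if (gapMs > 0) {
      lines.push(`file ${silenceFile}`);
      // Assumption: a bare number is read as seconds, so convert from ms.
      lines.push(`outpoint ${(gapMs / 1000).toFixed(3)}`);
    }
    lines.push(`file ${clip.file}`);
    cursorMs = clip.startMs + clip.durationMs;
  }
  return lines.join("\n") + "\n";
}

const clips = [
  { file: "firstFile.mp3", startMs: 200, durationMs: 5700 },
  { file: "secondFile.mp3", startMs: 7400, durationMs: 1200 },
  // The duration of the last clip is never used for a gap.
  { file: "thirdFile.mp3", startMs: 10600, durationMs: 0 },
];
console.log(buildConcatList(clips));
```

For this to work, `silence.mp3` has to be at least as long as the longest gap, since `outpoint` can only shorten a file.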

This solution also works okay when merging a few files, but when I merge a larger number of files, like 30 or 40, the resulting file has a slowly increasing synchronization problem: the audio files start later than the timestamps they are supposed to start at.

Looks like an issue similar to this post

I'm open to any suggestions for solving this issue.

Sync problem with what? You only appear to have one stream in the output - do you mean speed or pitch changes? — Gyan, Aug 07 '19 at 18:00
An audio file that is supposed to start at, let's say, 0:30:25 starts at 0:30:26, and all the audio files after it are delayed as well. It is as if every concat operation overflows by 0.02 seconds. — Saccarab, Aug 07 '19 at 18:07
So I guess the question is whether the concat demuxer has millisecond precision. For instance, if I concatenate 2 files of 0:01:00 and 0:01:20 duration, the result is often something like 0:02:22. — Saccarab, Aug 08 '19 at 08:37
Ok, this is not really an error, but a consequence of the audio codec. See https://stackoverflow.com/a/42415886. I haven't tested this but you could try adding `inpoint 0.026` for all of the non-silent MP3s. — Gyan, Aug 08 '19 at 10:12
I've tried adding 0.026 but that didn't help either. I also tried the inpoint with a one-line ffmpeg command to test it out. I put only 2 audio files in the text file and concatenated them, but the resulting audio duration was the same as if I hadn't included inpoint at all. So maybe I'm doing something wrong with the inpoints. — Saccarab, Aug 08 '19 at 14:53
I've made the changes in the text file with `inpoint 0.026`, but the output has exactly the same duration as when I didn't add it in the first place. — Saccarab, Aug 08 '19 at 15:00
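
The per-join overflow described in the comments is consistent with MP3 frame granularity: with stream copy a file can only be cut between frames, and the 0.026 figure matches one frame of 1152 samples at 44.1 kHz. A rough sketch of the arithmetic follows. It assumes 44.1 kHz MPEG-1 Layer III files and assumes that a cut silence is rounded up to whole frames; neither is confirmed in the thread. `framesForGap` and `overshootMs` are hypothetical names, and the gap values are the ones from the list file in the question, read as milliseconds.

```javascript
const SAMPLES_PER_FRAME = 1152; // MPEG-1 Layer III frame size

// Number of whole frames needed to cover a gap of gapMs milliseconds.
function framesForGap(gapMs, sampleRate = 44100) {
  return Math.ceil((gapMs * sampleRate) / (SAMPLES_PER_FRAME * 1000));
}

// How much longer than requested the rounded-up gap is, in ms.
function overshootMs(gapMs, sampleRate = 44100) {
  const frameMs = (SAMPLES_PER_FRAME / sampleRate) * 1000; // about 26.12 ms
  return framesForGap(gapMs, sampleRate) * frameMs - gapMs;
}

let total = 0;
for (const gapMs of [200, 1500, 2000]) {
  const extra = overshootMs(gapMs);
  total += extra;
  console.log(`${gapMs} ms -> ${framesForGap(gapMs)} frames, +${extra.toFixed(2)} ms`);
}
console.log(`accumulated overshoot: ${total.toFixed(2)} ms`);
```

Under these assumptions every join can only push later files further back, never forward, so the error would grow with the number of files. That would fit the drift seen with 30 or 40 inputs.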
