I am converting some audio files to a raw format with sox for machine learning, and I need to make sure that no clipping occurs, because clipped samples would contaminate my dataset.
This question has been asked before, but unfortunately the solutions did not work for me.
Several options have been proposed to help mitigate this problem.
The --norm option does not help much; sox still reports dither clipping:
$ for name in *.au; do sox --norm ${name} -c 1 -r 44100 --bits 8 ${name}.mono-sr41000-ss8.raw; done
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
The -v 0.99 option does worse; the rate effect now clips as well as dither:
$ for name in *.au; do sox -v 0.99 ${name} -c 1 -r 44100 --bits 8 ${name}.mono-sr41000-ss8.raw; done
sox WARN rate: rate clipped 11 samples; decrease volume?
sox WARN dither: dither clipped 8 samples; decrease volume?
sox WARN rate: rate clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN rate: rate clipped 12 samples; decrease volume?
sox WARN dither: dither clipped 9 samples; decrease volume?
sox WARN rate: rate clipped 6 samples; decrease volume?
sox WARN dither: dither clipped 5 samples; decrease volume?
sox WARN rate: rate clipped 10 samples; decrease volume?
sox WARN dither: dither clipped 11 samples; decrease volume?
sox WARN rate: rate clipped 3 samples; decrease volume?
sox WARN dither: dither clipped 3 samples; decrease volume?
sox WARN rate: rate clipped 2 samples; decrease volume?
sox WARN dither: dither clipped 2 samples; decrease volume?
The -G option seems to work best, but even with it some clipping still occurs:
$ for name in *.au; do sox -G ${name} -c 1 -r 41000 --bits 8 ${name}.mono-sr41000-ss8.raw; done
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 2 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
Combining -G and --norm did worse than either option alone:
$ for name in *.au; do sox -G --norm ${name} -c 1 -r 44100 --bits 8 ${name}.mono-sr41000-ss8.raw; done
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 2 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
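To compare these variants by numbers rather than by eyeballing the logs, the warnings can be tallied per effect. This is a minimal sketch, assuming the warning format shown in the logs above ("sox WARN <effect>: <effect> clipped N samples; ..."). The helper name tally_clipped is hypothetical, not part of sox.

```shell
# Sum the clipped-sample counts in sox warnings, grouped by effect.
# Reads sox's stderr text on stdin and prints one "<effect> <total>" line
# per effect. Assumes the "sox WARN <effect>: ... clipped N samples" format.
tally_clipped() {
  awk '
    /WARN/ && /clipped/ {
      effect = $3
      sub(/:$/, "", effect)                 # "dither:" -> "dither"
      for (i = 1; i < NF; i++)
        if ($i == "clipped") total[effect] += $(i + 1)
    }
    END { for (e in total) print e, total[e] }
  ' | sort
}

# Possible usage (sox writes its warnings to stderr, hence 2>&1):
#   for name in *.au; do sox -G "$name" -c 1 -r 44100 --bits 8 "$name.raw"; done 2>&1 | tally_clipped
```

The output is one line per effect, so a run that clips in both rate and dither can be told apart from one that clips only in dither.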
UPDATE: Reducing the volume further seems to eliminate the clipping: with -G -v 0.95 two warnings remain, and with -G -v 0.9 there are none.
Is this the best solution?
$ for name in *.au; do sox -G -v 0.95 ${name} -c 1 -r 44100 --bits 8 ${name}.mono-sr41000-ss8.raw; done
sox WARN dither: dither clipped 1 samples; decrease volume?
sox WARN dither: dither clipped 1 samples; decrease volume?
$ for name in *.au; do sox -G -v 0.9 ${name} -c 1 -r 44100 --bits 8 ${name}.mono-sr41000-ss8.raw; done
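Rather than picking one volume for every file by trial and error, the same idea can be applied per file: try a volume, check whether sox warned about clipping, and step down only if it did. This is a rough sketch under stated assumptions, not a definitive answer: it assumes sox is on the PATH and reports clipping only through WARN lines on stderr. The function name and the list of volume steps are illustrative choices of mine.

```shell
# Convert one file, lowering the volume step by step until sox reports
# no clipping. Prints the volume that was finally used.
# Assumption: clipping is detected by the word "clipped" in sox's stderr.
convert_without_clipping() {
  local src="$1" dst="$2" vol warnings
  for vol in 1.0 0.95 0.9 0.85 0.8; do
    # Capture stderr only; the audio itself is written to "$dst".
    warnings=$(sox -G -v "$vol" "$src" -c 1 -r 44100 --bits 8 "$dst" 2>&1 >/dev/null) || {
      printf '%s\n' "$warnings" >&2        # a real sox error, not a clipping warning
      return 2
    }
    case $warnings in
      *clipped*) ;;                        # still clipping: try the next, lower volume
      *) echo "$vol"; return 0 ;;
    esac
  done
  echo "still clipping at volume $vol: $src" >&2
  return 1
}

# Possible usage:
#   for name in *.au; do convert_without_clipping "$name" "$name.raw"; done
```

Note that this scales each file by a different factor, which may or may not be acceptable for a dataset; using the lowest volume found across all files would keep the scaling uniform.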