I am resampling audio files from 8 kHz to 16 kHz with torchaudio.
An example of an original file:
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, 1 channels, s16, 128 kb/s
After resampling, it becomes:
Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 16000 Hz, 1 channels, flt, 512 kb/s
So the sample format has changed from 16-bit integer (pcm_s16le) to 32-bit float (pcm_f32le), and the bitrate has gone from 128 kb/s to 512 kb/s.
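As far as I can tell, the format change by itself should not lose information, because a 32-bit float has a 24-bit significand and can represent every 16-bit integer sample exactly. The sketch below checks that with the standard library only. The helper names are hypothetical, and the scaling by 32768 is an assumption about how samples are normalized. It covers only the format conversion, not the resampling filter, which does change sample values.

```python
import struct


def int16_to_float32(sample: int) -> float:
    # Scale to [-1, 1) and force the value through a real 32-bit float.
    return struct.unpack("<f", struct.pack("<f", sample / 32768.0))[0]


def float32_to_int16(value: float) -> int:
    # Undo the scaling and round to the nearest integer sample.
    return int(round(value * 32768.0))


# Round-trip every possible 16-bit sample value and collect any that change.
mismatches = [
    s for s in range(-32768, 32768)
    if float32_to_int16(int16_to_float32(s)) != s
]
print(len(mismatches))  # prints 0
```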
I'd like to know whether this matters for training ASR systems.