1

I am facing a problem while working with audio files. I am implementing an algorithm that requires its input to be a 5 kHz mono audio file.

Most of the audio files I have are PCM 44.1 kHz 16-bit stereo, so my problem is: how do I convert 44.1 kHz stereo files to 5 kHz mono files?

I would be grateful if anyone could point me to a tutorial that explains the basics of the DSP behind the idea, or to any Java libraries.

Samer Makary

3 Answers

2

To augment what Prasad already said: you should low-pass filter the signal at 2.5 kHz before downsampling to prevent aliasing in the result. If there is a 4 kHz tone in the original signal, it can't be represented at a 5 kHz sample rate; it will be folded back across the 2.5 kHz Nyquist limit, creating a false ("aliased") tone at 1 kHz (5 kHz - 4 kHz).

See related: How to implement low pass filter using java
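As a rough illustration of the filtering step, here is a minimal sketch of a Hamming-windowed sinc low-pass filter applied by direct convolution. The class and method names (`LowPass`, `windowedSincKernel`, `filter`) and the tap count are my own assumptions, not part of any library, and in practice you would probably set the cutoff a bit below 2.5 kHz to leave room for the filter's transition band.

```java
final class LowPass {
    /**
     * Builds a Hamming-windowed sinc kernel (hypothetical helper).
     * cutoff is a fraction of the sample rate, e.g. 2500.0 / 44100.0.
     * taps must be odd so the kernel has a centre sample.
     */
    static double[] windowedSincKernel(double cutoff, int taps) {
        if (taps < 3 || taps % 2 == 0) {
            throw new IllegalArgumentException("taps must be odd and >= 3");
        }
        double[] h = new double[taps];
        int mid = taps / 2;
        double sum = 0.0;
        for (int i = 0; i < taps; i++) {
            int n = i - mid;
            // Ideal low-pass impulse response; the centre sample is the limit value.
            double sinc = (n == 0)
                    ? 2.0 * cutoff
                    : Math.sin(2.0 * Math.PI * cutoff * n) / (Math.PI * n);
            double window = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (taps - 1));
            h[i] = sinc * window;
            sum += h[i];
        }
        // Normalise so a constant (DC) input passes with gain 1.
        for (int i = 0; i < taps; i++) {
            h[i] /= sum;
        }
        return h;
    }

    /** Convolves x with h, centred on each sample; samples outside x count as zero. */
    static double[] filter(double[] x, double[] h) {
        double[] y = new double[x.length];
        int mid = h.length / 2;
        for (int i = 0; i < x.length; i++) {
            double acc = 0.0;
            for (int k = 0; k < h.length; k++) {
                int j = i + mid - k;
                if (j >= 0 && j < x.length) {
                    acc += x[j] * h[k];
                }
            }
            y[i] = acc;
        }
        return y;
    }
}
```

With something like `LowPass.filter(samples, LowPass.windowedSincKernel(2500.0 / 44100.0, 201))`, a 4 kHz tone should come out strongly attenuated while content well below 2.5 kHz passes nearly unchanged. Direct convolution is slow for long files; this is only meant to show the idea.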

Also, if you're downsampling from 44100 Hz to 5000 Hz, you'll be keeping one sample for every 8.82 original samples, which is not a nice integer ratio. This means you should also use some type of interpolation, since you'll be sampling the original signal at non-integer positions.
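To make the non-integer stepping concrete, here is a minimal sketch that uses linear interpolation, the simplest option (better interpolators exist). The `Resampler` class and `resampleLinear` method are hypothetical names, and the input is assumed to have been low-pass filtered already.

```java
final class Resampler {
    /**
     * Resamples x from srcRate to dstRate with linear interpolation.
     * Assumes x has already been low-pass filtered below dstRate / 2.
     */
    static double[] resampleLinear(double[] x, double srcRate, double dstRate) {
        if (x.length == 0) {
            return new double[0];
        }
        double step = srcRate / dstRate; // 8.82 for 44100 -> 5000
        int outLen = (int) Math.floor((x.length - 1) / step) + 1;
        double[] y = new double[outLen];
        for (int i = 0; i < outLen; i++) {
            double pos = i * step;                     // fractional read position
            int i0 = (int) pos;                        // sample just before pos
            int i1 = Math.min(i0 + 1, x.length - 1);   // sample just after, clamped
            double frac = pos - i0;
            y[i] = x[i0] * (1.0 - frac) + x[i1] * frac;
        }
        return y;
    }
}
```

For one second of input (44100 samples), `Resampler.resampleLinear(x, 44100, 5000)` yields 5000 output samples, each read from position `i * 8.82` in the original.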

Matt Montag
  • Thanks a lot. This is what I ended up with after a lot of reading about DSP and low-pass filters, so you reassured me that I am on the right track. I think I will use a windowed-sinc filter and convolve the signal with its kernel. I came across this: http://www.eetimes.com/design/programmable-logic/4017985/Designing-Digital-Filters – Samer Makary Aug 24 '11 at 19:11
1

Java Sound API (javax.sound.*) contains a lot of useful functions to manipulate sounds.

http://download.oracle.com/javase/tutorial/sound/index.html

You can find existing Java code for downsampling an audio file HERE.

Mohammad Najar
1

In the stereo PCM data I have handled, every other 16-bit value in the PCM byte array usually belongs to a particular stereo channel; this is called interleaving. So first grab every other 16-bit value from the stereo data to extract a mono PCM byte array.
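A minimal sketch of that de-interleaving step, assuming 16-bit signed little-endian samples laid out as left, right, left, right (the usual WAV layout, but check your file's format). `StereoToMono` and its methods are hypothetical names; `average` is an alternative I added that mixes both channels instead of dropping one.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class StereoToMono {
    /** Keeps only the left channel: every other 16-bit value, as described above. */
    static short[] leftChannel(byte[] pcm) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        int frames = pcm.length / 4; // 2 channels * 2 bytes per sample
        short[] mono = new short[frames];
        for (int i = 0; i < frames; i++) {
            mono[i] = buf.getShort(i * 4); // right sample sits at i * 4 + 2
        }
        return mono;
    }

    /** Alternative: averages left and right so neither channel is discarded. */
    static short[] average(byte[] pcm) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        int frames = pcm.length / 4;
        short[] mono = new short[frames];
        for (int i = 0; i < frames; i++) {
            int left = buf.getShort(i * 4);
            int right = buf.getShort(i * 4 + 2);
            mono[i] = (short) ((left + right) / 2);
        }
        return mono;
    }
}
```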

As for the downsampling: if you were to play a 44100 Hz audio file as if it were a 5000 Hz audio file, you would have too much data, which would make it sound slowed down. So take samples in increments of int(44100/5000) to downsample it to a 5 kHz signal.
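For reference, a minimal sketch of taking every n-th sample (`Decimator` and `takeEveryNth` are hypothetical names). Two caveats: int(44100/5000) is 8, so the result is actually at 44100 / 8 = 5512.5 Hz rather than exactly 5000 Hz, and nothing here removes content above the new Nyquist limit, so the input should be low-pass filtered first.

```java
final class Decimator {
    /** Keeps samples 0, step, 2 * step, ... and drops the rest (no filtering). */
    static short[] takeEveryNth(short[] mono, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        int outLen = (mono.length + step - 1) / step; // ceiling division
        short[] out = new short[outLen];
        for (int i = 0; i < outLen; i++) {
            out[i] = mono[i * step];
        }
        return out;
    }
}
```

With `int step = 44100 / 5000;` (integer division, giving 8), one second of 44100 Hz mono input comes out as 5513 samples.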

Malz
  • Thanks a lot for the help. After some reading I got the general idea. Can you explain it in more detail? – Samer Makary Aug 22 '11 at 21:54
  • You'll need to ask me more specific questions to get more specific answers, I'm not sure what to expand on sorry! – Malz Aug 22 '11 at 22:38
  • I mean the downsampling part. I read the file into an array of bytes, which holds the samples at a 44.1 kHz sampling rate (the format of the file), so how do I downsample it to 5 kHz? Could you explain the steps or the algorithm to make it clear? – Samer Makary Aug 23 '11 at 21:35
  • So you have that byte array. Every two bytes is one 16-bit PCM value. To get mono, read every other pair of bytes, or in other words every other short. Once you have that mono byte array, read the shorts in increments of int(44100/5000) instead of every other short, and you will downsample the frequency. – Malz Aug 24 '11 at 00:22
  • No, you won't. You will have to do low-pass filtering before that, which will prevent aliasing. – Erol Jun 08 '12 at 00:23