I am having a hard time understanding how these audio files work. I load a WAV file like this:
from scipy.io import wavfile
RATE, data = wavfile.read('audio.wav')
This is a single-channel (mono) file (I think those are the same thing, right?). It has shape (1555794,), and when I play it back it sounds reasonable:
import pyaudio
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, output=True)
stream.write(data)
stream.close()
p.terminate()
I have another file, which I created in Audacity by duplicating the first channel to make it two-channel; otherwise it is basically the same. When I load it the same way as above, I get data_2channel. When I try to play data_2channel back with the same 5 lines as above, it plays, but at half speed. This isn't unexpected. To get it to play back normally I have to do this:
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, output=True)
for d in data_2channel:
    stream.write(d)
stream.close()
p.terminate()
This I can swallow, but I don't quite understand it. Maybe the stream recognizes that I am writing ndarrays with 2 values each and does something different than when I shove in a 1-D array with 13000 values?
So then I thought, well, let's convert data_2channel back to one channel:
>>> data
array([-1552, -1635, -1361, ..., -2695, -3610, -2742], dtype=int16)
>>> data_2channel.T[0]
array([-1551, -1638, -1357, ..., -2700, -3606, -2744], dtype=int16)
So data_2channel.T[0] looks pretty much the same as data, but when I try to play it I get this error:
File "../lib/python3.5/site-packages/pyaudio.py", line 586, in write
exception_on_underflow)
ValueError: ndarray is not C-contiguous
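For reference, here is a small check of the memory layout that I could run. It uses a made-up stand-in array (`stereo` is hypothetical, not my real file), and `np.ascontiguousarray` is only my guess at a workaround; I have not confirmed that it fixes playback:

```python
import numpy as np

# Hypothetical stand-in for data_2channel: 4 frames, 2 channels, int16.
stereo = np.array([[1, 1], [2, 2], [3, 3], [4, 4]], dtype=np.int16)

# Selecting one channel gives a strided view that skips every other sample.
left_view = stereo.T[0]
print(left_view.flags['C_CONTIGUOUS'])   # False
print(left_view.strides)                 # (4,)

# Copying into a fresh buffer packs the samples back to back.
left_copy = np.ascontiguousarray(left_view)
print(left_copy.flags['C_CONTIGUOUS'])   # True
print(left_copy.strides)                 # (2,)
```

If that is what is going on, the values are identical in both arrays and only the layout in memory differs.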
Not knowing what to think, I tried converting data to a Python list and back to an ndarray, and then playing that back:
import numpy as np
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, output=True)
stream.write(np.array(data.tolist()))
stream.close()
p.terminate()
It sounds similar to my file, but as if I had slowed it down by 8 times while somehow keeping the track the same length. I don't get why it sounds different. I tried plotting the two arrays and they look exactly the same. But somehow converting to a list and back again broke something. Can someone explain to me how this all works?
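One check that goes beyond plotting is to compare the dtype and byte size of the two arrays. The sketch below uses a short made-up array (`samples` is hypothetical) in place of my real data. I would expect NumPy to pick a wider integer type when it rebuilds the array from a list, but I have not verified that this is what breaks playback:

```python
import numpy as np

# Hypothetical stand-in for the int16 data returned by wavfile.read.
samples = np.array([-1552, -1635, -1361], dtype=np.int16)

# Round trip through a Python list, as in my playback attempt.
round_trip = np.array(samples.tolist())

# Same values, but possibly a different dtype and more bytes per sample.
print(samples.dtype, samples.nbytes)
print(round_trip.dtype, round_trip.nbytes)
print(np.array_equal(samples, round_trip))  # True

# Casting back should restore the original 2-byte layout.
restored = round_trip.astype(np.int16)
print(restored.tobytes() == samples.tobytes())  # True
```

A plot only shows the values, so it would not reveal a difference like this.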
EDIT
Also, it's not playing the entire file back to me. It's only playing about 50% of it. And I found that I can play the two-channel file back by changing the channels parameter to 2 (duh).
stream = p.open(format=pyaudio.paInt16, channels=2, rate=RATE, input=True, output=True)
stream.write(data_2channel)
However, playback this way is cut to about 25% of the original. I'm baffled.
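My best guess, which I have not confirmed against the PyAudio source, is that write works out the number of frames from len() of whatever it is given, expecting a bytes object rather than an ndarray. The sketch below only does that arithmetic on made-up arrays (`mono`, `stereo`, and `guessed_frames` are hypothetical names, not PyAudio API). Under that assumption the fractions match what I am hearing, so passing data.tobytes() might be worth trying:

```python
import numpy as np

SAMPLE_WIDTH = 2  # bytes per paInt16 sample


def guessed_frames(buf, channels):
    """Frame count IF it were derived as len(buf) / (channels * width).

    This formula is an assumption about the library, not a confirmed fact.
    """
    return len(buf) // (channels * SAMPLE_WIDTH)


# Hypothetical stand-ins: 1000 frames each, mono and stereo.
mono = np.zeros(1000, dtype=np.int16)
stereo = np.zeros((1000, 2), dtype=np.int16)

# len() of an ndarray counts rows, not bytes, so the guess comes up short.
print(guessed_frames(mono, 1))               # 500
print(guessed_frames(stereo, 2))             # 250

# len() of a bytes object counts bytes, so the guess covers every frame.
print(guessed_frames(mono.tobytes(), 1))     # 1000
print(guessed_frames(stereo.tobytes(), 2))   # 1000
```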