
I'm trying to get the frequencies from the array generated in PyAudio's callback().

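import numpy as np
import pyaudio

def callback(in_data, frame_count, time_info, flag):
    # in_data is raw bytes; interpret them as 32-bit float samples
    audio_data = np.frombuffer(in_data, dtype=np.float32)
    freq_data = np.fft.fft(audio_data)
    freq = np.abs(freq_data)  # magnitude of each frequency bin
    # Operations here (these should produce filtered_freq)
    # ifft returns complex values; keep the real part before converting back to bytes
    recovered_signal = np.fft.ifft(filtered_freq).real.astype(np.float32).tobytes()
    return (recovered_signal, pyaudio.paContinue)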

I'm getting an array of length 2048 and am not sure how to proceed. I've narrowed down the operations I need and tried applying an FFT to it, but realized that I first need to unpack the data, and PyAudio's documentation is a little lacking (and sometimes not even online).

Part of my problem is that I don't understand what in_data is. From what I can tell from my research, it's bytes, which numpy converts into an array for me. However, an article on signal processing in Python gave me the impression that I should be able to extract the frequencies from it and then apply something like this as a basic passband filter:
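
For reference, a minimal sketch of what in_data holds, assuming the stream was opened with format=pyaudio.paFloat32 (the stream setup is not shown in the question). The bytes are the raw samples packed back to back, 4 bytes per float32 sample, and each sample is the amplitude of the waveform at one instant. The buffer below is built by hand so the example runs without a sound card; RATE and FRAMES are assumed values, not taken from the question.

```python
import numpy as np

RATE = 48000    # assumed sample rate in Hz
FRAMES = 1024   # assumed frames per buffer

# Build a fake mono buffer: a 440 Hz sine wave packed as float32 bytes,
# standing in for what PyAudio would pass to the callback as in_data.
t = np.arange(FRAMES) / RATE
in_data = np.sin(2 * np.pi * 440 * t).astype(np.float32).tobytes()

# Unpack: every 4 bytes become one float32 amplitude sample.
audio_data = np.frombuffer(in_data, dtype=np.float32)

num_bytes = len(in_data)
num_samples = audio_data.size
duration_s = num_samples / RATE  # time span covered by this buffer
```

So the array is a short slice of the waveform in the time domain (about 21 ms here); it carries no frequency information until a transform is applied.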

filtered_freq = []
for index, f in enumerate(freq):
    # keep only bins inside the band whose magnitude exceeds 1
    if LOWCUT < index < HIGHCUT and f > 1:
        filtered_freq.append(f)
    else:
        filtered_freq.append(0)

I've looked at np.fft.fftfreq as well, but that also seems to produce an array of length 2048, rather than an array containing all the frequencies and their power.
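
A hedged sketch of how the two arrays fit together: the FFT output has one entry per frequency bin, and np.fft.fftfreq (or np.fft.rfftfreq for real input) gives the frequency in Hz of each bin, so the two arrays are read side by side. The example below assumes a mono buffer of 2048 samples at 48 kHz and a hypothetical 1000 to 2000 Hz pass band; none of these values come from the question. It masks the complex spectrum rather than the magnitudes, because taking np.abs first discards the phase needed to rebuild the signal.

```python
import numpy as np

RATE = 48000                        # assumed sample rate in Hz
N = 2048                            # samples in one mono buffer
LOW_HZ, HIGH_HZ = 1000.0, 2000.0    # hypothetical pass band in Hz

# Test signal: a 1500 Hz tone (inside the band) plus a 6000 Hz tone (outside).
t = np.arange(N) / RATE
signal = (np.sin(2 * np.pi * 1500 * t)
          + 0.5 * np.sin(2 * np.pi * 6000 * t)).astype(np.float32)

# rfft keeps only the non-negative frequencies of a real signal (N//2 + 1 bins).
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(N, d=1.0 / RATE)   # frequency in Hz of each bin
magnitude = np.abs(spectrum)               # strength of each bin

peak_hz = freqs[np.argmax(magnitude)]      # strongest frequency in the buffer

# Band-pass: zero every bin outside the band, keeping the complex values
# so the phase survives the inverse transform.
mask = (freqs > LOW_HZ) & (freqs < HIGH_HZ)
filtered = np.fft.irfft(spectrum * mask, n=N).astype(np.float32)

filtered_magnitude = np.abs(np.fft.rfft(filtered))
filtered_peak_hz = freqs[np.argmax(filtered_magnitude)]
```

Each bin is RATE / N Hz wide (about 23.4 Hz here), so cut-offs expressed as bin indices, as in the loop above, correspond to index * RATE / N in Hz.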

Edit: I know that with two channels the samples are interleaved, but my issue is mostly not understanding what the array converted by numpy represents and how it can be used.
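
As a sketch of the interleaving, assuming a two-channel float32 stream with 1024 frames per buffer (which would explain an array of 2048 values): the samples alternate left, right, left, right, and a reshape separates them. The constant-valued channels below are made up purely so the split is easy to check.

```python
import numpy as np

CHANNELS = 2   # assumed stereo stream
FRAMES = 1024  # assumed frames per buffer

# Fake interleaved buffer: left channel is all 0.25, right is all -0.5.
left_in = np.full(FRAMES, 0.25, dtype=np.float32)
right_in = np.full(FRAMES, -0.5, dtype=np.float32)
interleaved = np.empty(FRAMES * CHANNELS, dtype=np.float32)
interleaved[0::2] = left_in
interleaved[1::2] = right_in
in_data = interleaved.tobytes()

# Unpack and split: one row per frame, one column per channel.
samples = np.frombuffer(in_data, dtype=np.float32)
frames = samples.reshape(-1, CHANNELS)
left, right = frames[:, 0], frames[:, 1]

# Re-interleave before handing bytes back to the stream.
out_data = np.column_stack((left, right)).astype(np.float32).tobytes()
```

Any FFT would then be applied to left and right separately; transforming the interleaved array directly mixes the two channels.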

kckaiwei
  • it seems to be interleaved, left and right channel, data. - https://stackoverflow.com/a/22644499/2823755 and http://portaudio.com/docs/v19-doxydocs/portaudio_8h.html#a8a60fb2a5ec9cbade3f54a9c978e2710 – wwii May 15 '18 at 23:01
  • I understand it's interleaved, but I don't know what to do with the 1048 length arrays :\. Not quite getting an understanding of what the array represents when converted by numpy. – kckaiwei May 16 '18 at 13:09
  • It really isn't clear what you're asking. `in_data` is the audio data, which appears to be in string format, and in the function `audio_data` is a numpy array of the audio data. If you opened a WAV file it should be 48 kHz 16-bit two-channel data, and once it is in a NumPy ndarray you can do whatever you want with it. The docs for `numpy.fft.fft` seem pretty clear and there are examples. Please read [mcve]. – wwii May 16 '18 at 14:41
  • I guess I'm asking how to process the audio_data. Mostly what do the numbers mean? I understand what fft does, but applying fft to what is to me an array of numbers without meaning does nothing. I guess I might be asking the wrong community, and should be targeting signal processing instead. Only turning here since I can't seem to dig up anyone explicitly explaining what the audio_data being returned is for someone who has never worked with sound before. – kckaiwei May 16 '18 at 17:43
  • Thinking about it more, is the array just an array of the intensity of the sound wave over 1024 samples (2048 in this case)? Reading this: http://pythonforengineers.com/audio-and-digital-signal-processingdsp-in-python/ made me think I'd get more than just 1024 (2048) values after applying fft. – kckaiwei May 16 '18 at 17:59

0 Answers