
I am writing a bit of code in C++ where I want to play a .wav file and perform an FFT (with fftw) on it as it comes in (and eventually display that FFT on screen with ncurses). This is mainly just a "for giggles/to see if I can" project, so I have no restrictions on what I can or can't use aside from wanting to keep the result fairly lightweight and cross-platform (I'm doing this on Linux for the moment). I'm also trying to do this "right" and not just hack it together.

I'm using SDL2_audio to achieve the playback, which is working fine. The callback is called at some interval requesting N bytes (seems to be desiredSamples*nChannels). My idea is that while I'm copying the memory from my input buffer to SDL, I might as well also copy it into fftw3's input array to run an FFT on it. Then I can set ncurses to refresh at whatever rate I'd like, separate from the audio callback frequency, and it'll just pull the most recent data from the output array.

The catch is that the input file stores the channels packed together, i.e. "(LR) (LR) (LR) ...". So while SDL expects this, I need a way to get just one channel to send to FFTW.

The audio callback format from SDL looks like so:

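void myAudioCallback(void* userdata, Uint8* stream, int len) {
    // Zero the whole buffer; sizeof(stream) would only give the pointer size.
    SDL_memset(stream, 0, len);
    // Hand SDL the next len bytes of audio and advance the read position.
    SDL_memcpy(stream, audio_pos, len);
    audio_pos += len;
}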

where userdata is (currently) unused, stream is the array that SDL wants filled, and len is the length of stream (i.e. the number of bytes SDL is looking for).
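
As written, the callback keeps reading past the end of the audio data once the file runs out. Here is a minimal sketch of the same fill logic with the end of the buffer handled. It is written against plain byte pointers so it does not depend on SDL; the `Playback` struct and `fillStream` function are hypothetical names, and it assumes zero bytes mean silence, which holds for signed sample formats.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical playback state; in the SDL callback this could be passed via userdata.
struct Playback {
    const std::uint8_t* pos;   // next unread byte of the decoded .wav data
    std::size_t remaining;     // bytes left starting at pos
};

// Fills `stream` with `len` bytes: audio while it lasts, zeros afterwards.
// Returns the number of audio bytes actually copied.
std::size_t fillStream(Playback& pb, std::uint8_t* stream, std::size_t len) {
    const std::size_t n = std::min(len, pb.remaining);
    std::memcpy(stream, pb.pos, n);
    std::memset(stream + n, 0, len - n);  // assumption: zero bytes are silence
    pb.pos += n;
    pb.remaining -= n;
    return n;
}
```

The returned count also says how many bytes are valid audio, which is the same range that would be worth passing on to the FFT.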

As far as I know there's no way to get memcpy to just copy every other sample (read: Copy N bytes, skip M, copy N, etc). My current best idea is a brute-force for loop a la...

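// pseudocode, assuming interleaved stereo and a sample type named `sample`
const sample* in = (const sample*)audio_pos;
for (int i = 0; i < len / (2 * sizeof(sample)); i++) {
    fftw_in[i] = in[2*i];   // keep the left channel, skip the right
}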

or even more brute force by just reading the file a second time and only taking every other byte or something.

Is there another way to go about accomplishing this, or is one of these my best option? It feels kind of kludgey to go from a nice one-line memcpy to send the data to SDL to some sort of weird loop to send it to fftw.

fergu
  • (I am the same person as OP. Apparently clicking the 'login with GitHub' link didn't lead to the same place as my regular account even though I thought they were the same :P) – fergu Sep 23 '21 at 22:25
  • So, your question is basically how to read every other byte from a byte array? – Zakk Sep 23 '21 at 22:43
  • I don't think the loop is weird at all. It does what you want: it stores every other byte from the audio. Memcpy is fast because it is able to copy large blocks of sequential data. You clearly can't do that if you want to skip bytes. Is this causing serious performance issues or are you simply wondering if the code looks good? – h0r53 Sep 23 '21 at 22:47
  • https://stackoverflow.com/questions/6013779 – genpfault Sep 23 '21 at 23:03
  • If all that you want to do with the data is compute the fft, then you don't even have to copy your data. In fact fftw has support for doing multiple ffts on multiple packed channels. See [here](https://www.fftw.org/fftw3_doc/Advanced-Complex-DFTs.html). – user2407038 Sep 23 '21 at 23:10
  • @user2407038 - This is a perfect solution for my goal! It answers the question I meant rather than the one I asked :) – fergu Sep 24 '21 at 00:13
  • @h0r53 - No performance issues, nor do I think I'm even close to encountering them. It was more of a "is this the right way to do it" question. – fergu Sep 24 '21 at 00:14

2 Answers


The OP's solution can hardly be simplified any further. For copying bytes, it could be written as:

// pseudocode
const char* s = audio_pos;
for (int d = 0; s < audio_pos + len; d++, s += 2*sizeof(sample)) {
    fftw_in[d] = *s;
}

If I knew what fftw_in is, I would memcpy blocks of sizeof(*fftw_in).
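
For whole samples rather than single bytes, the same strided loop can be sketched as below. This is only an illustration under assumptions the question does not confirm: the data is interleaved signed 16-bit PCM in native byte order, and the FFT input wants `double` values. The helper name `extractChannel` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy one channel out of interleaved PCM.
// Assumes signed 16-bit samples in native byte order.
std::vector<double> extractChannel(const std::uint8_t* bytes, std::size_t lenBytes,
                                   int numChannels, int channel) {
    const std::size_t frameBytes = sizeof(std::int16_t) * numChannels;
    const std::size_t frames = lenBytes / frameBytes;  // a trailing partial frame is ignored
    std::vector<double> out(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        std::int16_t s;
        // memcpy sidesteps alignment and aliasing problems when reading from a byte buffer
        std::memcpy(&s, bytes + i * frameBytes + channel * sizeof(std::int16_t), sizeof s);
        out[i] = static_cast<double>(s);
    }
    return out;
}
```

With stereo data, `extractChannel(audio_pos, len, 2, 0)` would give the left channel of the block SDL just asked for. The conversion to `double` has to happen per sample anyway, so a plain loop loses nothing compared with a block memcpy here.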

273K
  • It turns out that for my specific case `fftw` has built in support for what I'm trying to do, but this is the best answer for the question I actually asked so I'll mark it accepted. For anyone trying to repeat this with FFTW, see user2407038's answer on the OP – fergu Sep 24 '21 at 00:16

Please check assembly generated by @S.M.'s solution.

If the code is not vectorized, I would use intrinsics (depending on your hardware support) like _mm_mask_blend_epi8

Vlad Feinstein