
I'm writing a piano simulator where I continuously send buffers to WASAPI. I'm trying to do it in AUDCLNT_SHAREMODE_EXCLUSIVE mode, but I still don't understand how to handle it.

With the code below, I instantiate a separate thread for each call to PlayBuf().

The problem is that after instantiating the first thread, if I try to instantiate a second one, the AUDCLNT_E_DEVICE_IN_USE error appears.

It is certainly my fault, as I have not yet understood how to use WASAPI in EXCLUSIVE mode.

Thanks

void PlayBuf(short *fileBytes, int fileSize)
{
    HRESULT hr;
    IMMDeviceEnumerator *deviceEnumerator = NULL;
    IMMDevice* audioDevice;
    IAudioClient2* audioClient;
    WAVEFORMATEX wfx = {};
    IAudioRenderClient* audioRenderClient;
    UINT32 bufferSizeInFrames;
    UINT32 bufferPadding;
    int16_t* buffer;
    
    CoInitialize(NULL);

    hr = CoCreateInstance(__uuidof(MMDeviceEnumerator),NULL,CLSCTX_ALL, __uuidof(IMMDeviceEnumerator),(LPVOID *)(&deviceEnumerator));
    assert (hr == S_OK);

    hr = deviceEnumerator->GetDefaultAudioEndpoint(eRender,eConsole,&audioDevice);
    assert(hr == S_OK);
    deviceEnumerator->Release();

    hr = audioDevice->Activate(__uuidof(IAudioClient2),CLSCTX_ALL,NULL,(LPVOID*)(&audioClient));
    assert(hr == S_OK);
    audioDevice->Release();

    wfx.wFormatTag = WAVE_FORMAT_PCM;
    wfx.nChannels = 2;
    wfx.nSamplesPerSec = 44100;
    wfx.wBitsPerSample = 16;
    wfx.nBlockAlign = (wfx.nChannels * wfx.wBitsPerSample) / 8;
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;

    const int64_t REFTIMES_PER_SEC = 10000000;
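    // DurataSuono (Italian for "sound duration") is presumably a global duration in seconds, defined elsewhere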
    REFERENCE_TIME requestedSoundBufferDuration = REFTIMES_PER_SEC * DurataSuono;
    DWORD initStreamFlags = ( AUDCLNT_STREAMFLAGS_RATEADJUST);

    hr = audioClient->Initialize(AUDCLNT_SHAREMODE_EXCLUSIVE,initStreamFlags,requestedSoundBufferDuration,0, &wfx, NULL);
    assert(hr == S_OK);

    hr = audioClient->GetService(__uuidof(IAudioRenderClient),
    (LPVOID*)(&audioRenderClient));
    assert(hr == S_OK);

    hr = audioClient->GetBufferSize(&bufferSizeInFrames);
    assert(hr == S_OK);


    audioClient->Reset();
    hr = audioClient->Start();
    assert(hr == S_OK);

    hr = audioRenderClient->GetBuffer(fileSize, (BYTE**)(&buffer));
    assert(hr == S_OK);

    hr = audioRenderClient->ReleaseBuffer(fileSize, 0);
    assert(hr == S_OK);

    Sleep(2000);

    audioClient->Stop();
    audioClient->Release();
    audioRenderClient->Release();
}
  • The [`AUDCLNT_E_DEVICE_IN_USE`](https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-initialize) error is expected, because you have already opened the device in exclusive mode. [Typically, only a small number of "pro audio" or RTC applications require exclusive mode.](https://learn.microsoft.com/en-us/windows/win32/coreaudio/user-mode-audio-components) So could you share your use case for requiring exclusive mode? – Rita Han Jan 15 '21 at 02:03
  • So could it be a limitation of the sound card or driver? – David Jan 15 '21 at 10:20
  • No. Exclusive mode provides exclusive access. It is not shared. It is by design. – Rita Han Jan 18 '21 at 03:18
  • As noted by Rita Han, device-in-use is expected here. Just to clarify, exclusive mode means the device can only be used by one thread (across all running processes!) at a time. But in audio, you typically want one audio processing thread at all times, so even with shared mode this design seems kind of sketchy. Why don't you just run a single dedicated audio thread? – Sjoerd van Kreel Jan 28 '21 at 18:14
  • I already tried with only one thread. But with only one thread you have to wait for each sound to finish before processing a new sound, so the sounds are heard in sequence. Schematically: PlayBuf(C); PlayBuf(E); PlayBuf(G); you hear C, then E, then G in sequence. Instead, I need C, E, G to play together at the same time. Using shared mode, you can call C, E, G one after the other and the sounds are mixed on the fly; basically you hear a chord. The problem with shared mode is that the latency is high. – David Jan 29 '21 at 19:19
  • You have to mix the audio data yourself. Even with shared mode this is the preferred way to go. Threads are expensive. How do you receive the incoming data? – Sjoerd van Kreel Jan 29 '21 at 22:59
  • I also tried to mix the incoming data, but it is difficult to manage the flow. Example: I press C and it plays; while it is playing I press C, E, G at the same time: how do I interrupt the previous C? How do I know how many keys the user has pressed simultaneously, to mix them and only then send them to play? I said to myself: if I use the WASAPI engine mixer and many threads, the problem becomes easier. In any case, the data comes from both the PC keyboard and a MIDI keyboard. – David Jan 30 '21 at 07:48
  • Don't submit audio directly to wasapi as a result of the user pressing the keyboard. You should have one UI thread which calculates note lengths and cutoffs etc. from user input, then submit that info to the audio thread which turns it into actual sound, mixing notes etc., and only then submit that to wasapi. Maybe this https://stackoverflow.com/questions/26265575/playing-multiple-byte-arrays-simultaneously-in-java/26285895#26285895 could get you started; although it doesn't involve user input, it's about mixing multiple incoming audio signals on a single dedicated audio thread. – Sjoerd van Kreel Jan 30 '21 at 11:58
  • Using only one thread, the problem is to determine how many keys the user is pressing at the same time before sending the mixed buffer to wasapi. – David Jan 30 '21 at 18:30

3 Answers


I took an hour to whip up a basic sample for you. It's in C# using my own audio I/O library XT-Audio (so, plug intended), but using raw WASAPI in C++ it'd probably take me half a day. Anyway, I believe this comes really close to what you're looking for. As you see below, this app has the world's most awesome GUI:

[Demo UI screenshot] (source: github.io)

As soon as you press start, the app starts translating keyboard input to audio. You can press & hold the c, d, e, f and g keyboard keys to generate musical notes. It handles multiple overlapping notes (chords), too. I chose wasapi shared mode as the backend because it supports floating-point audio, but this will work just as well with exclusive mode if you translate the audio to 16-bit integer format.
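For reference, a minimal sketch of that 16-bit translation in C++ (my own illustration; the helper name float_to_int16 is made up), clamping first so overshoots don't wrap around:

#include <algorithm>
#include <cstdint>

// Convert one normalized float sample in [-1, 1] to 16-bit signed PCM.
inline int16_t float_to_int16(float sample)
{
    float clamped = std::clamp(sample, -1.0f, 1.0f);
    return static_cast<int16_t>(clamped * 32767.0f);
}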

A difference when working with this library vs raw wasapi is that the audio thread is managed by the library, and the application gets its audio callback function invoked periodically to synthesize audio data. However, this translates easily back to native wasapi using C++: just call IAudioRenderClient::GetBuffer/ReleaseBuffer in a loop on a background thread, and do your processing in between these calls.
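That native loop might look roughly like this (a sketch, not from the sample below: it assumes an event-driven exclusive-mode stream is already initialized, and hEvent, bufferSizeInFrames and FillAudio are placeholder names):

// Event-driven exclusive-mode render loop: WASAPI signals hEvent each time
// it is ready for more data; lock the buffer, fill it, release it.
while (running)
{
    WaitForSingleObject(hEvent, INFINITE);
    BYTE* data = nullptr;
    if (FAILED(audioRenderClient->GetBuffer(bufferSizeInFrames, &data)))
        break;
    FillAudio(reinterpret_cast<int16_t*>(data), bufferSizeInFrames); // mix active notes here
    if (FAILED(audioRenderClient->ReleaseBuffer(bufferSizeInFrames, 0)))
        break;
}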

Anyway, the key part is this: this app only uses 2 threads, one for UI (managed by WinForms) and one for audio (managed by the audio library), and yet it is capable of playing multiple musical notes simultaneously, which I believe is at the heart of your question.

I uploaded the full Visual Studio solution and binaries here: WasapiSynthSample, but for completeness I'll post the interesting parts of the code below.

using System;
using System.Threading;
using System.Windows.Forms;
using Xt;

namespace WasapiSynthSample
{
    public partial class Program : Form
    {
        // sampling rate
        const int Rate = 48000;        
        // stereo
        const int Channels = 2;
        // default format for wasapi shared mode
        const XtSample Sample = XtSample.Float32;
        // C, D, E, F, G
        static readonly float[] NoteFrequencies = { 523.25f, 587.33f, 659.25f, 698.46f, 783.99f };

        [STAThread]
        static void Main()
        {
            // initialize audio library
            using (var platform = XtAudio.Init(null, IntPtr.Zero, null))
            {
                Application.EnableVisualStyles();
                Application.SetCompatibleTextRenderingDefault(false);
                Application.ThreadException += OnApplicationThreadException;
                AppDomain.CurrentDomain.UnhandledException += OnCurrentDomainUnhandledException;
                Application.Run(new Program(platform));
            }
        }

        // pop a messagebox on any error
        static void OnApplicationThreadException(object sender, ThreadExceptionEventArgs e)
        => OnError(e.Exception);
        static void OnCurrentDomainUnhandledException(object sender, UnhandledExceptionEventArgs e)
        => OnError((Exception)e.ExceptionObject);
        static void OnError(Exception e)
        {
            var text = e.ToString();
            if (e is XtException xte) text = XtAudio.GetErrorInfo(xte.GetError()).ToString();
            MessageBox.Show(text);
        }

        XtStream _stream;
        readonly XtPlatform _platform;

        // note phases
        readonly float[] _phases = new float[5];
        // tracks key down/up
        readonly bool[] _notesActive = new bool[5];

        public Program(XtPlatform platform)
        {
            _platform = platform;
            InitializeComponent();
        }

        // activate note
        protected override void OnKeyDown(KeyEventArgs e)
        {
            base.OnKeyDown(e);
            if (e.KeyCode == Keys.C) _notesActive[0] = true;
            if (e.KeyCode == Keys.D) _notesActive[1] = true;
            if (e.KeyCode == Keys.E) _notesActive[2] = true;
            if (e.KeyCode == Keys.F) _notesActive[3] = true;
            if (e.KeyCode == Keys.G) _notesActive[4] = true;
        }

        // deactivate note
        protected override void OnKeyUp(KeyEventArgs e)
        {
            base.OnKeyUp(e);
            if (e.KeyCode == Keys.C) _notesActive[0] = false;
            if (e.KeyCode == Keys.D) _notesActive[1] = false;
            if (e.KeyCode == Keys.E) _notesActive[2] = false;
            if (e.KeyCode == Keys.F) _notesActive[3] = false;
            if (e.KeyCode == Keys.G) _notesActive[4] = false;
        }

        // stop stream
        void OnStop(object sender, EventArgs e)
        {
            _stream?.Stop();
            _stream?.Dispose();
            _stream = null;
            _start.Enabled = true;
            _stop.Enabled = false;
        }

        // start stream
        void OnStart(object sender, EventArgs e)
        {
            var service = _platform.GetService(XtSystem.WASAPI);
            var id = service.GetDefaultDeviceId(true);
            using (var device = service.OpenDevice(id))
            {
                var mix = new XtMix(Rate, Sample);
                var channels = new XtChannels(0, 0, Channels, 0);
                var format = new XtFormat(in mix, in channels);
                var buffer = device.GetBufferSize(in format).current;
                var streamParams = new XtStreamParams(true, OnBuffer, null, null);
                var deviceParams = new XtDeviceStreamParams(in streamParams, in format, buffer);
                _stream = device.OpenStream(in deviceParams, null);
                _stream.Start();
                _start.Enabled = false;
                _stop.Enabled = true;
            }
        }

        // this gets called on the audio thread by the audio library
        // but could just as well be your c++ code managing its own threads
        unsafe int OnBuffer(XtStream stream, in XtBuffer buffer, object user)
        {
            // process audio buffer of N frames
            for (int f = 0; f < buffer.frames; f++)
            {
                // compose current sample of all currently active notes
                float sample = 0.0f;
                for (int n = 0; n < NoteFrequencies.Length; n++)
                {
                    if (_notesActive[n])
                    {
                        _phases[n] += NoteFrequencies[n] / Rate;
                        if (_phases[n] >= 1.0f) _phases[n] = -1.0f;
                        float noteSample = (float)Math.Sin(2.0 * _phases[n] * Math.PI);
                        sample += noteSample / NoteFrequencies.Length;
                    }
                }

                // write current sample to output buffer
                for (int c = 0; c < Channels; c++)
                    ((float*)buffer.output)[f * Channels + c] = sample;
            }
            return 0;
        }
    }
}
  • Really interesting, your solution. If I understand correctly, in pseudocode: MyThread() { while (1) { AudioRenderClient::GetBuffer / ReleaseBuffer } } – David Jan 31 '21 at 18:55
  • What I am not clear on is: if you are using a global buffer, the same buffer that wasapi uses, and you modify it with new data while wasapi is playing it (kind of a race condition). – David Jan 31 '21 at 18:59
  • That's what GetBuffer/ReleaseBuffer is for. In exclusive mode I think it locks half of a double buffer; in shared mode you have to call IAudioClient::GetCurrentPadding first. In any case, you lock part of a larger buffer, and the data you write will only be played by wasapi after you called ReleaseBuffer. – Sjoerd van Kreel Feb 01 '21 at 08:10
  • This is my solution, but it doesn't work properly: while(1) { hr = audioRenderClient->GetBuffer(fileSize, (BYTE**)(&buffer)); assert(hr == S_OK); FillBufferWasapi(); hr = audioRenderClient->ReleaseBuffer(fileSize, 0); assert(hr == S_OK); } void FillBufferWasapi() { for(int ii=0;ii<255;ii++) { if(KeyDown[ii] == 1) // 1 = pressed { for(int i=0; i – David Feb 01 '21 at 09:45
  • Can you post a complete, compiling example? – Sjoerd van Kreel Feb 01 '21 at 10:45

I merged FillBufferWasapi() into the thread code to have code that is clearer to me, since I don't have a lot of experience with real-time applications, but I can't see the error.

int wavPlaybackSample = 0;
int k=0;
while(flags != AUDCLNT_BUFFERFLAGS_SILENT)
{
    DWORD retval = WaitForSingleObject(hEvent, 2000);

    for(int ii=0;ii<255;ii++)
    {
        if(MyKeyDown[ii] == 1)
        {
            hr = audioRenderClient->GetBuffer(bufferSizeInFrames, (BYTE**)(&buffer));
            assert(hr == S_OK);

            for (UINT32 frameIndex = 0+k; frameIndex < bufferSizeInFrames+k; ++frameIndex)
            {
                *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample++]; // left
                *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample++]; // right
            }

            k+=bufferSizeInFrames;

            hr = audioRenderClient->ReleaseBuffer(bufferSizeInFrames, flags);
            assert(hr == S_OK);

            if(k >= MyBufferLength/4)
            {
                k=0;
                wavPlaybackSample=0;
            }
        }
    }
}

Every time I press a key, I set the corresponding flag to 1, so that the buffers containing the samples are summed.

The difference between my version and yours, which is a synthesizer, is that mine uses 88 preloaded buffers containing the sounds (WAV samples) of a real piano.
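As a side note, summing 16-bit buffers can exceed the 16-bit range when several notes overlap; a minimal sketch of a saturating mix (a hypothetical helper, not part of my program as posted):

#include <algorithm>
#include <cstdint>

// Mix one note's samples into an accumulation buffer, widening to 32 bits
// and clamping so overlapping notes don't wrap around and click.
void MixInto(int16_t* dst, const int16_t* note, int sampleCount)
{
    for (int i = 0; i < sampleCount; ++i)
    {
        int32_t sum = static_cast<int32_t>(dst[i]) + note[i];
        dst[i] = static_cast<int16_t>(std::clamp(sum, -32768, 32767));
    }
}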

int16_t* buffer;
int MyKeyDown[255];
int MyKeyCode[255];

short *fileBytes = new short[MyBufferLength];

void __fastcall TMyThread::Execute()
{
    HRESULT hr;
    int16_t* buffer;

    HANDLE hEvent = NULL;
    REFERENCE_TIME hnsRequestedDuration = 0;
    DWORD flags = 0;

    CoInitialize(NULL);
    //CoInitializeEx( NULL, COINIT_MULTITHREADED );

    IMMDeviceEnumerator *deviceEnumerator;
    hr = CoCreateInstance(__uuidof(MMDeviceEnumerator),NULL,CLSCTX_ALL, __uuidof(IMMDeviceEnumerator),(LPVOID *)(&deviceEnumerator));
    assert (hr == S_OK);

    IMMDevice* audioDevice;
    hr = deviceEnumerator->GetDefaultAudioEndpoint(eRender,eConsole,&audioDevice);
    assert(hr == S_OK);
    deviceEnumerator->Release();

    IAudioClient2* audioClient;
    hr = audioDevice->Activate(__uuidof(IAudioClient2),CLSCTX_ALL,NULL,(LPVOID*)(&audioClient));
    assert(hr == S_OK);
    // note: audioDevice is deliberately not released yet; it is needed again
    // below if the requested buffer size turns out not to be aligned


    WAVEFORMATEX wfx = {};
    wfx.wFormatTag = WAVE_FORMAT_PCM;
    wfx.nChannels = 2;
    wfx.nSamplesPerSec = 44100;
    wfx.wBitsPerSample = 16;
    wfx.nBlockAlign = (wfx.nChannels * wfx.wBitsPerSample) / 8;
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;

    hr = audioClient->GetDevicePeriod(NULL, &hnsRequestedDuration);
    assert(hr == S_OK);

    hr = audioClient->Initialize(AUDCLNT_SHAREMODE_EXCLUSIVE,
    AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
    hnsRequestedDuration,
    hnsRequestedDuration,
    &wfx,
    NULL);

    // If the requested buffer size is not aligned...
    UINT32 nFrames = 0;
    if(hr == AUDCLNT_E_BUFFER_SIZE_NOT_ALIGNED)
    {
        // Get the next aligned frame.
        hr = audioClient->GetBufferSize(&nFrames);
        assert (hr == S_OK);

        hnsRequestedDuration = (REFERENCE_TIME)
        ((10000.0 * 1000 / wfx.nSamplesPerSec * nFrames) + 0.5);

        // Release the old audio client and create a new one from the device.
        audioClient->Release();
        hr = audioDevice->Activate(__uuidof(IAudioClient2),CLSCTX_ALL,NULL,(LPVOID*)(&audioClient));
        assert(hr == S_OK);

        // Open the stream and associate it with an audio session.
        hr = audioClient->Initialize(
        AUDCLNT_SHAREMODE_EXCLUSIVE,
        AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
        hnsRequestedDuration,
        hnsRequestedDuration,
        &wfx,
        NULL);
        assert(hr == S_OK);
    }
    audioDevice->Release();

    hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    if (hEvent == NULL)
    {
        hr = E_FAIL;
        ShowMessage("CreateEvent fail!!!");
    }

    hr = audioClient->SetEventHandle(hEvent);
    assert(hr == S_OK);

    IAudioRenderClient *audioRenderClient;
    hr = audioClient->GetService(__uuidof(IAudioRenderClient),
    (LPVOID*)(&audioRenderClient));
    assert(hr == S_OK);

    UINT32 bufferSizeInFrames;
    hr = audioClient->GetBufferSize(&bufferSizeInFrames);
    assert(hr == S_OK);

    // from here play buffer
    hr = audioClient->Start();
    assert(hr == S_OK);

    int wavPlaybackSample = 0;

    while(flags != AUDCLNT_BUFFERFLAGS_SILENT)
    {
        DWORD retval = WaitForSingleObject(hEvent, 2000);

        UINT32 bufferPadding;
        hr = audioClient->GetCurrentPadding(&bufferPadding);
        assert(hr == S_OK);

        UINT32 soundBufferLatency = bufferSizeInFrames / 1;
        UINT32 numFramesToWrite = soundBufferLatency - bufferPadding;
        
        FillBufferWasapi();

        hr = audioRenderClient->GetBuffer(numFramesToWrite, (BYTE**)(&buffer));
        assert(hr == S_OK);


        for (UINT32 frameIndex = 0; frameIndex < numFramesToWrite; ++frameIndex)
        {
            *buffer++ = fileBytes[wavPlaybackSample]; // left
            *buffer++ = fileBytes[wavPlaybackSample]; // right

            ++wavPlaybackSample;
            //wavPlaybackSample %= fileSize;
        }
        hr = audioRenderClient->ReleaseBuffer(numFramesToWrite, flags);
        assert(hr == S_OK);

        //Sleep((DWORD)(hnsRequestedDuration/10000000));
    }

    audioClient->Stop();
    audioClient->Release();
    audioRenderClient->Release();

    CoUninitialize();
}
//---------------------------------------------------------------------------

void FillBufferWasapi()
{
    for(int ii=0;ii<255;ii++)
    {
        if(MyKeyDown[ii] == 1)
        {
            for(int i=0; i<MyBufferLength;i++)
                fileBytes[i] += bfn[MyKeyCode[ii]][i];
        }
    }
}
//---------------------------------------------------------------------------

void __fastcall TForm1::AppMessage(MSG &Msg, bool &Handled)
{
    MyKeyCode['Z']=3; // C1
    MyKeyCode['X']=5; // D1
    MyKeyCode['C']=7; // E1
    
    switch (Msg.message)
    {
    case WM_KEYDOWN:
        
        if(MyKeyDown[Msg.wParam] == 0)
        {
            MyKeyDown[Msg.wParam] = 1;
        }
        break;

    case WM_KEYUP:
        if(MyKeyDown[Msg.wParam] == 1)
        {
            MyKeyDown[Msg.wParam] = 0;
        }
        break;
    }
}
  • I'll take a look at it tonight, at work now. However, are you sure this compiles and runs out of the box? For example, I don't see MyKeyDown declared anywhere. – Sjoerd van Kreel Feb 01 '21 at 11:55
  • int MyKeyDown[255]; is a global integer array indexed by the ASCII code of the key pressed. Example: MyKeyCode['Z'] = 3; corresponds to the third buffer that contains the note in WAV format; MyKeyCode['X'] = 5; corresponds to the fifth buffer that contains the note in WAV format. – David Feb 01 '21 at 12:03
  • Please post a minimal but complete compiling and runnable sample. Something we can just paste into Visual Studio and hit "play" :) For example, I don't see a main() or WinMain() function here. – Sjoerd van Kreel Feb 01 '21 at 13:29
  • Sorry, but I use Embarcadero, not Visual Studio. – David Feb 01 '21 at 14:25
  • Doesn't matter, just post a single file which will compile, link, and run with any C++ compiler. – Sjoerd van Kreel Feb 01 '21 at 14:56
  • Unfortunately some features are part of the Embarcadero VCL library, such as keyboard message management and thread management. – David Feb 01 '21 at 16:14
  • Ok. 1) For exclusive mode you should pass bufferSizeInFrames to IAudioRenderClient::GetBuffer. 2) Using WASAPI in polling mode is sort-of deprecated, use event-driven mode instead, see IAudioClient::Initialize. 3) wfx.nChannels = 2, but in FillBufferWasapi you treat the buffer as mono audio data. 4) Don't reset MyKeyDown in FillBufferWasapi, use WM_KEYUP in AppMessage instead. 5) Don't use AUDCLNT_STREAMFLAGS_RATEADJUST, that's only for shared mode. 6) FillBufferWasapi should use bufferSizeInFrames, too, instead of MyBufferLength. That's about all I can say based on what you posted. – Sjoerd van Kreel Feb 01 '21 at 18:09
  • A runnable sample would really help here; any basic MFC or Win32 app would suffice, as long as it's self-contained. That would require you to write it from scratch, of course. – Sjoerd van Kreel Feb 01 '21 at 18:10
  • I made some changes – David Feb 01 '21 at 18:30
  • You're gonna have to be a bit more specific. – Sjoerd van Kreel Feb 01 '21 at 19:26
  • I corrected the example with right channel and left channel. Some errors you have already reported to me are still present. – David Feb 02 '21 at 09:32
  • Sorry, but I didn't understand this: 2) Using WASAPI in polling mode is sort-of deprecated, use event-driven mode instead, see IAudioClient::Initialize. – David Feb 02 '21 at 10:13
  • You want to pass AUDCLNT_STREAMFLAGS_EVENTCALLBACK to IAudioClient::Initialize and then use SetEventHandle to provide wasapi with an event object, to notify your code when it has processed the previous buffer. That way, on every iteration you know you can process exactly bufferSizeInFrames, and you don't have to query GetCurrentPadding anymore. Also, it turns your audio loop, which is now essentially a busy wait, into a blocking wait. Much more efficient; essentially wasapi tells you when it is ready, instead of your code repeatedly asking "are you done yet?". – Sjoerd van Kreel Feb 02 '21 at 10:32
  • Also, don't call FillBufferWasapi from the UI thread (in AppMessage). The audio renderer should run on one thread only. – Sjoerd van Kreel Feb 02 '21 at 10:40
  • Thank you, I'll try to modify my program with these indications https://learn.microsoft.com/en-us/windows/win32/coreaudio/exclusive-mode-streams – David Feb 02 '21 at 10:55
  • Rewritten, but unstable and full of doubts and errors: help – David Feb 03 '21 at 15:20
  • I'd be glad to help you out, but at this point you really, really need to provide a standalone application. Full source code (that's really EVERYTHING, including GUI, audio thread, main function etc.) that builds, links and runs using the Microsoft C++ compiler. It can be as minimal as you want, pretty much like the demo I posted, but it HAS TO BE self-contained. And yes, this means you're gonna need to write it from scratch, not using Embarcadero (I'd install it but it looks like they only offer paid versions), and upload the full source code somewhere. Otherwise I'm out, sorry. – Sjoerd van Kreel Feb 03 '21 at 16:48
  • Your help was really precious. I was just wondering, using events instead of polling, whether the event should be generated by pressing the key, and I think so. But then: if the events add up, how can a single thread be fast enough to respond to multiple events? – David Feb 03 '21 at 17:25
  • You don't raise the event, you wait for it. WASAPI raises it. – Sjoerd van Kreel Feb 03 '21 at 18:20
  • Just for information: Embarcadero is free. Thanks for the advice and for your time. – David Feb 03 '21 at 18:33
  • Really? Ok, if that's the case I'll help you out some more. But still, upload your full project somewhere, otherwise I ain't got nothing to go on. – Sjoerd van Kreel Feb 03 '21 at 18:57
  • Yes. https://www.embarcadero.com/products/cbuilder/starter/free-download – David Feb 03 '21 at 20:14
  • "Also, don't call FillBufferWasapi from the UI thread (in AppMessage). The audio renderer should run on one thread only." Sorry if I seem to repeat myself, but I didn't understand where you think I should call FillBufferWasapi(). You tell me not in the UI, and therefore not after pressing a key, right? I always thought that it is the press of a key that triggers the event, and that consequently wasapi starts filling its buffer, right? – David Feb 04 '21 at 09:20
  • No. Wasapi triggers the event. You wait for it. Something like this: while(1) { WaitForEvent(); GetBuffer(); FillBufferWasapi(); ReleaseBuffer(); }. You never ever touch IAudioClient and friends from the UI thread, except possibly for CreateInstance() or Release() (that's IUnknown::Release, not ReleaseBuffer). – Sjoerd van Kreel Feb 04 '21 at 13:43
  • OK, thanks. Another thing that is not clear to me is the size of the buffer to send to wasapi. – David Feb 04 '21 at 15:06
  • I have 88200 * 5 second buffers that contain the note samples (C, D, E, F, G, ...) but, calling audioClient->GetBufferSize(&bufferSizeInFrames), it returns a much smaller value; do I have to divide the note buffer to send to wasapi into portions? Example for note C: buffer size = 88200 * 5 seconds = 441000 bytes. Calling audioClient->GetBufferSize(&bufferSizeInFrames) tells me: bufferSizeInFrames = 160 bytes; do I have to do 441000 / 160 = 2756.25 sends to wasapi with one cycle, or can you send it all at once? – David Feb 04 '21 at 15:08
  • Be careful, wasapi measures buffer sizes in frames, not bytes. But yes, for 1 second of 44100 Hz audio data, you must (GetBuffer/process buffer/ReleaseBuffer) 44100/160 times, where 160 is what is reported by IAudioClient::GetBufferSize. So yeah, about 2756 "cycles" indeed. Note that this is only for exclusive event-driven mode; shared and polling work a bit differently. – Sjoerd van Kreel Feb 04 '21 at 15:11
  • Ah thanks, so 160 frames is 160 * numberOfChannels * sizeof(short) = 640 bytes? – David Feb 04 '21 at 15:38
  • where numberOfChannels = 2 – David Feb 04 '21 at 15:38
  • Most likely. If sizeof(short) == wfx.wBitsPerSample/8. – Sjoerd van Kreel Feb 04 '21 at 15:47
  • I modified part of the thread code, but I don't know how to put the direct link to the modification – David Feb 04 '21 at 16:20
  • Just paste it as a comment. – Sjoerd van Kreel Feb 04 '21 at 16:28
  • As a comment it says "too long" – David Feb 04 '21 at 16:31
  • Again, can't do much without full source code. – Sjoerd van Kreel Feb 04 '21 at 17:01
  • The samples are missing, as they would take up too much space – David Feb 04 '21 at 17:12
  • Remove Sleep() from the audio loop. That's handled by waiting for the event. Does it make any sound at all? – Sjoerd van Kreel Feb 04 '21 at 17:13
  • It does not emit any sound, the WAV samples are needed – David Feb 04 '21 at 17:26
  • I use these in WAV format http://theremin.music.uiowa.edu/MISpiano.html – David Feb 04 '21 at 17:37
  • I mean, does it produce any sound at all, for you? When the samples are loaded? – Sjoerd van Kreel Feb 04 '21 at 17:38
  • Yes, but they are distorted and do not resemble the original sound of a piano. – David Feb 04 '21 at 17:42
  • And the samples you load are, in fact, 16-bit signed PCM format? – Sjoerd van Kreel Feb 04 '21 at 17:55
  • Yes, it's as you say. Sorry, just to avoid misunderstandings: in practice, instead of playing generated sounds like in your example, mine is a sample player – David Feb 04 '21 at 17:57
  • Single-channel, 44.1 kHz? – Sjoerd van Kreel Feb 04 '21 at 17:58
  • Stereo, 44100 Hz – David Feb 04 '21 at 18:00
  • That's a stereo sample, but you treat it as mono. See *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample]; // left *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample]; // right. Also, this doesn't seem right: if(k >= MyBufferLength/4) { ... } – Sjoerd van Kreel Feb 04 '21 at 18:08
  • if(k >= MyBufferLength/4) { ... } I put this because the AUDCLNT_BUFFERFLAGS_SILENT event never happens – David Feb 04 '21 at 18:34
  • That's because it's not an event. It's an indication by YOUR code to wasapi, to ignore the buffer contents. – Sjoerd van Kreel Feb 04 '21 at 18:42
  • I used the Microsoft example https://learn.microsoft.com/en-us/windows/win32/coreaudio/exclusive-mode-streams – David Feb 04 '21 at 19:26
  • They set the flags to SILENT explicitly when out of data (i.e. at buffer end). See https://learn.microsoft.com/en-us/windows/win32/coreaudio/rendering-a-stream. – Sjoerd van Kreel Feb 04 '21 at 19:31
  • But it isn't wasapi setting the SILENT flag, right? – David Feb 04 '21 at 19:38
  • for (UINT32 frameIndex = 0+k; frameIndex < bufferSizeInFrames+k; ++frameIndex) { *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample++]; // left *buffer++ = bfn[MyKeyCode[ii]][wavPlaybackSample++]; // right } – David Feb 04 '21 at 19:40
  • See IAudioRenderClient::ReleaseBuffer. It can never set the flag, as the flag is accepted by value, not by pointer. Indeed it is the Microsoft example code which sets the flag. – Sjoerd van Kreel Feb 04 '21 at 19:57
  • Yeah, your last example seems about right. Still no decent sound? – Sjoerd van Kreel Feb 04 '21 at 19:58
  • The piano sound sounds like a distorted synth and never stops – David Feb 04 '21 at 20:03
  • I still don't understand who should set this AUDCLNT_BUFFERFLAGS_SILENT flag – David Feb 05 '21 at 16:01
  • Nobody does. It just tells wasapi to ignore the buffer (don't play it). – Sjoerd van Kreel Feb 05 '21 at 16:06
  • Sorry, you wrote: "while(1) { WaitForEvent(); GetBuffer(); FillBufferWasapi(); ReleaseBuffer(); }". But what event should be expected? – David Feb 05 '21 at 17:27
  • Anyway, I tried moving the FillBufferWasapi() here: if (MyKeyDown[Msg.wParam] == 0) { MyKeyDown[Msg.wParam] = 1; FillBufferWasapi(); } and it seems to work better, but it doesn't stop anymore, it keeps looping. – David Feb 05 '21 at 17:33
  • See https://learn.microsoft.com/en-us/windows/win32/api/audioclient/nf-audioclient-iaudioclient-seteventhandle. Again, don't call FillBuffer from the UI thread. – Sjoerd van Kreel Feb 05 '21 at 22:41
  • Thank you. I'm almost there. The tricky part is synchronizing the keyboard and wasapi thread events. How do the two synchronize? If the for loop in wasapi is loading a note buffer and a new one arrives, and since the note buffer index is not at zero, information is lost. for (int frameIndex = 0 + k; frameIndex – David Feb 06 '21 at 20:09
  • Well, there are multiple ways to do this. The simple one is like I did it: just check for note-down in the fill-buffer function. Indeed this loses information, in that note lengths always become a multiple of the buffer size. A (the?) "correct" way to do it is to have a lock-free queue shared between the two threads, which the GUI posts note down/up messages tagged with timestamps to (a bit like MIDI messages, if you will), and the audio thread then reads these messages and correlates the timestamps to buffer positions. This is a bit more involved, but entirely possible. Also see the IAudioClock(2) interface. – Sjoerd van Kreel Feb 07 '21 at 11:43
  • A circular buffer, could it be a lossless solution? – David Feb 07 '21 at 11:58
  • Yes. It is guaranteed to be lossless if your audio processing callback is fast enough. If it's not fast enough, an unbounded queue wouldn't help either, because then the lag between pressing a key and hearing the note becomes progressively larger the longer you run. And besides, you really don't want to do heap allocations during low-latency audio processing. – Sjoerd van Kreel Feb 07 '21 at 12:20
  • What leaves me doubtful about using a shared queue between two threads (lock-free) is this: while wasapi is using the queue for playing but also for filling its buffer, the GUI, through events and fillbuffer, renews its buffer, anyway temporally (like MIDI): but couldn't there be overlaps and data loss? – David Feb 07 '21 at 19:04
  • "Using the queue for play but also for filling its buffer": that's the same thing. "GUI through events and fillbuffer..." The GUI never calls FillBufferWasapi. I assume you mean "GUI fills the shared queue"? Also, it does not "renew" the shared queue; it pushes to the back of it, while the audio thread reads from the front of it. That's why it's important that your audio thread is fast enough: to prevent overflowing the shared queue (since it is a data structure with bounded size). – Sjoerd van Kreel Feb 08 '21 at 09:01
  • Thank you. I am still studying the producer/consumer problem. I don't like the circular buffer very much. I would like to find a system that doesn't oblige me to synchronize threads. Perhaps instantiating dynamic buffers of 160 frames each? short* buffer[1000]; buffer[0] = new short[160]; buffer[1] = new short[160]; buffer[2] = new short[160]; So when the consumer has something to do, he is left with his frame queue. If the queue is empty, it waits. – David Feb 11 '21 at 17:08
  • What's wrong with a ring buffer? It's not very different from your solution, I think, apart from an extra level of indirection? In any case, whatever data structure you use to communicate between the two threads, be sure to pre-allocate all of it (don't do dynamic allocations while the audio is running) and don't use any operating-system level locking (critical_section & friends). – Sjoerd van Kreel Feb 11 '21 at 19:54
  • Thank you. What escapes me is whether you usually clear the buffer with: memset(fileBytes, 0, MyBufferLength); – David Feb 14 '21 at 19:29
  • Isn't the buffer containing prerecorded sample data in your case? – Sjoerd van Kreel Feb 14 '21 at 21:48
  • fileBytes is the pass buffer. I have: buf[0], buf[1], ..., buf[88]; sample buffers that contain notes. fileBytes is the buffer that contains the chords I pass to wasapi. For example, if I press a key and never reset fileBytes, a loop is created. I had also thought about resetting the bytes as they are consumed by wasapi, but it doesn't work very well, a sort of generate/take. – David Feb 15 '21 at 07:44
  • It looks like fileBytes contains audio data, right? If so, don't do that. Have the UI thread push note down/up, or possibly even full ADSR info if you're working with an external keyboard. Only translate stuff from "MIDI-like" info into actual audio when on the audio thread. – Sjoerd van Kreel Feb 15 '21 at 22:47
  • That's right, fileBytes contains the sound data. The problem is to fill this with single notes but also chords. Once fileBytes is filled, it is poured into the wasapi buffer: but when do you reset fileBytes, never? – David Feb 16 '21 at 13:16
  • The way I'd do it is to have 88 pre-allocated read-only buffers containing the piano notes, have the GUI thread send note up/down messages tagged with timestamps to the audio thread, and on the audio thread, read from the note buffers at sample positions derived from those timestamps and the wasapi IAudioClock info, then mix them together into wasapi's buffer. No need to reset anything. Although you probably want some sort of release envelope to prevent notes from being cut off too harshly at key-up. – Sjoerd van Kreel Feb 16 '21 at 16:16

This should fit your needs 99%: it's a sample player in pure C++ using WASAPI.

To compile and link:

  • Needs a C++17(+) conforming compiler
  • Install the boost library, used for the lock-free queue
  • Probably needs the MS C++ compiler (uses conio.h)
  • Link against avrt.lib for the real-time audio thread (uses AvSetMmThreadPriority)
  • In case you need it, full vs2019 project

To run:

  • You need 5 .wav files in 44100 Hz 16-bit stereo format, called c4.wav to g4.wav.
  • See SamplePack

What it does:

  • The console app runs a _getch() loop; c, d, e, f, g trigger note-on, q quits
  • Since it is a console app, there are no note-off messages. Each keypress triggers a playback of the full sample.
  • Note-down gets tagged with a timestamp and posted to a shared queue (this is the boost lock-free thing, capped at size 64).
  • So, you can crash it by pressing more than 64 keys in a 3-millisecond interval (the minimum wasapi exclusive latency).
  • The audio thread picks up these messages and puts them in an "active notes" list that's local to the audio thread. Active notes are bounded by the maximum polyphony (64).
  • So, you can also crash it by pressing more than 64 keys within [length of the shortest sample] seconds.
  • Each active note is mixed into the current wasapi buffer, until it reaches the end of the .wav sample.

Here's the code:

#include <atomic>
#include <vector>
#include <cstdio>
#include <cstdint>
#include <cassert>
#include <climits>
#include <fstream>
#include <cstring>
#include <iostream>
#include <filesystem>
#include <boost/lockfree/queue.hpp>

#include <conio.h>
#include <atlbase.h>
#include <Windows.h>
#include <avrt.h>
#include <mmdeviceapi.h>
#include <Audioclient.h>

// for wasapi event callback
static HANDLE event_handle;

// sample data
static const size_t sample_count = 5;
static int16_t* note_samples[sample_count];
static size_t note_frame_counts[sample_count];
static std::vector<char> note_samples_raw[sample_count];
static char const* note_files[sample_count] = { 
  "c4.wav", "d4.wav", "e4.wav", "f4.wav", "g4.wav"
};

// user input / audio thread communication
static std::atomic_bool stop_finished;
static std::atomic_bool stop_initiated;

// scale mix volume
static const double mix_scale_amp = 0.4;

// debug stuff
static int32_t prev_note_active_count = 0;
static int32_t prev_note_audible_count = 0;

// timing stuff
static const int64_t millis_per_second = 1000;
static const int64_t reftimes_per_milli = 10000;

// audio format = 44.1khz 16bit stereo
static const int32_t sample_size = 2;
static const int32_t channel_count = 2;
static const int32_t sample_rate = 44100;
static const int32_t frame_size = sample_size * channel_count;

// exclusive mode event driven must use 128-byte aligned buffers
static const int32_t alignment_requirement_bytes = 128;

// note down notification + timestamp
static const size_t note_queue_size = 64;
struct note_down_msg
{
  int32_t note; // 0..4 = c..g
  uint64_t time_stamp_qpc;
};
static boost::lockfree::queue<note_down_msg> 
note_msg_queue(note_queue_size);

// current playing notes
static const size_t max_polyphony = 64;
struct active_note
{
  // slot in use?
  bool in_use;
  // note + timestamp
  note_down_msg msg;
  // position relative to stream pos when it should start
  uint64_t trigger_pos_frames;
  // how many of it has played already
  size_t frames_rendered;
  active_note() = default;
};
static active_note 
active_notes[max_polyphony];

// shared by user input / audio thread
struct audio_thread_data
{
  IAudioClock* clock;
  IAudioClient* client;
  IAudioRenderClient* render;
};

// bail out on any error
#define CHECK_COM(expr) do {                \
  HRESULT hr = expr;                        \
  if(SUCCEEDED(hr)) break;                  \
  std::cout << #expr << ": " << hr << "\n"; \
  std::terminate();                         \
} while(0)

static WAVEFORMATEXTENSIBLE
make_audio_format()
{
  // translate format specification to WAVEFORMATEXTENSIBLE
  WAVEFORMATEXTENSIBLE result = { 0 };
  result.dwChannelMask = 0;
  result.SubFormat = KSDATAFORMAT_SUBTYPE_PCM;
  result.Samples.wValidBitsPerSample = sample_size * 8;
  result.Format.nChannels = channel_count;
  result.Format.nSamplesPerSec = sample_rate;
  result.Format.wBitsPerSample = sample_size * 8;
  result.Format.wFormatTag = WAVE_FORMAT_EXTENSIBLE;
  result.Format.cbSize = sizeof(WAVEFORMATEXTENSIBLE);
  result.Format.nBlockAlign = channel_count * sample_size;
  result.Format.nAvgBytesPerSec = channel_count * sample_size * sample_rate;
  return result;
}

static void
load_note_samples()
{
  for(size_t i = 0; i < sample_count; i++)
  {
    // load piano samples to bytes
    auto path = std::filesystem::current_path() / note_files[i];
    std::ifstream input(path, std::ios::binary);
    assert(input);
    input.seekg(0, input.end);
    size_t length = input.tellg();
    input.seekg(0, input.beg);
    note_samples_raw[i].resize(length);
    input.read(note_samples_raw[i].data(), length);
    assert(input);
    input.close();

    // compute frame count and set actual audio data
    // 44 bytes skipped for .WAV file header
    note_frame_counts[i] = (length - 44) / (sample_size * channel_count);
    note_samples[i] = reinterpret_cast<int16_t*>(note_samples_raw[i].data() + 44);
  }
}

// this runs audio processing
static DWORD WINAPI
run_audio_thread(void* param)
{
  int16_t* audio;
  BYTE* audio_mem;
  bool slot_found;
  UINT32 buffer_frames;

  HANDLE task;
  BOOL success;
  DWORD wait_result;
  DWORD task_index = 0;

  UINT64 clock_pos;
  UINT64 clock_freq;
  UINT64 clock_qpc_pos;
  LARGE_INTEGER qpc_freq;

  audio_thread_data* data = static_cast<audio_thread_data*>(param);

  // init thread
  CHECK_COM(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED));
  task = AvSetMmThreadCharacteristicsW(TEXT("Pro Audio"), &task_index);
  assert(task != nullptr);

  // wasapi buffer frame count & clock info
  CHECK_COM(data->client->GetBufferSize(&buffer_frames));
  CHECK_COM(data->clock->GetFrequency(&clock_freq));
  success = QueryPerformanceFrequency(&qpc_freq);
  assert(success);

  // audio loop
  data->client->Start();
  while(!stop_initiated.load())
  {
    wait_result = WaitForSingleObject(event_handle, INFINITE);
    assert(wait_result == WAIT_OBJECT_0);

    // retrieve and clear buffer for this round
    CHECK_COM(data->render->GetBuffer(buffer_frames, &audio_mem));
    audio = reinterpret_cast<int16_t*>(audio_mem);
    memset(audio, 0, buffer_frames * static_cast<uint64_t>(frame_size));
    
    // get timing stuff
    CHECK_COM(data->clock->GetPosition(&clock_pos, &clock_qpc_pos));        
    uint64_t stream_offset_hns = clock_pos * reftimes_per_milli * millis_per_second / clock_freq;
    uint64_t stream_offset_frames = stream_offset_hns * sample_rate / (reftimes_per_milli * millis_per_second);

    // process each frame
    for(size_t f = 0; f < buffer_frames; f++)
    {
      // pop user input, find empty slot in active notes buffer
      // for better performance this can also be outside the frame
      // loop, at start of each buffer round, in that case add 1 additional buffer latency
      note_down_msg msg;
      while(note_msg_queue.pop(msg))
      {
        slot_found = false;
        for(size_t i = 0; i < max_polyphony; i++)
          if(!active_notes[i].in_use) 
          {
            slot_found = true;
            active_notes[i].msg = msg;
            active_notes[i].in_use = true;
            active_notes[i].frames_rendered = 0;
            int64_t clock_note_diff_qpc = clock_qpc_pos - static_cast<int64_t>(active_notes[i].msg.time_stamp_qpc);
            int64_t clock_note_diff_hns = clock_note_diff_qpc * reftimes_per_milli * millis_per_second / qpc_freq.QuadPart;
            int64_t clock_note_diff_frames = clock_note_diff_hns * sample_rate / (reftimes_per_milli * millis_per_second);
            int64_t note_clock_diff_frames = -static_cast<int64_t>(clock_note_diff_frames);
            // allow 1 buffer latency otherwise notes would have to start in the past
            active_notes[i].trigger_pos_frames = stream_offset_frames + note_clock_diff_frames + buffer_frames;
            assert(active_notes[i].trigger_pos_frames <= stream_offset_frames + buffer_frames * 3);
            assert(active_notes[i].trigger_pos_frames >= stream_offset_frames + f);
            break;
          }
        if(!slot_found)       
          assert(!"Max polyphony reached.");
      }
    
      // debugging stuff
      int32_t note_active_count = 0;
      int32_t note_audible_count = 0;      

      // compose frame from all samples active up to max_polyphony
      double current_samples[channel_count] = { 0 };
      for(size_t i = 0; i < max_polyphony; i++)
      {
        // slot not in use
        if(!active_notes[i].in_use) continue;
        note_active_count++;

        // not my turn yet
        // note this very briefly wastes a slot for a sample which starts halfway in the current buffer
        if(active_notes[i].trigger_pos_frames > stream_offset_frames + f) continue;

        if(active_notes[i].frames_rendered == note_frame_counts[active_notes[i].msg.note])
        {
          // reached sample end
          active_notes[i].in_use = false;
          active_notes[i].frames_rendered = 0;
          continue;
        }

        // note is active + audible
        note_audible_count++;
        size_t frame_index = active_notes[i].frames_rendered++;
        for(size_t c = 0; c < channel_count; c++)
        {
          assert(active_notes[i].msg.note < sample_count);
          assert(frame_index < note_frame_counts[active_notes[i].msg.note]);
          current_samples[c] += static_cast<double>(note_samples[active_notes[i].msg.note][frame_index * channel_count + c] * mix_scale_amp) / SHRT_MAX;        
        }
      }

      // normally never do io on the audio thread, just debugging
      if(prev_note_active_count != note_active_count || prev_note_audible_count != note_audible_count)
        ;//std::cout << "\nactive: " << note_active_count << " audible: " << note_audible_count << "\n";
      prev_note_active_count = note_active_count;
      prev_note_audible_count = note_audible_count;

      // convert to int16 and write to wasapi
      for(size_t c = 0; c < channel_count; c++)
        audio[f * channel_count + c] = static_cast<int16_t>(current_samples[c] * SHRT_MAX);
    }

    CHECK_COM(data->render->ReleaseBuffer(buffer_frames, 0));
  }
  data->client->Stop();

  // cleanup
  success = AvRevertMmThreadCharacteristics(task);
  assert(success);
  CoUninitialize();
  stop_finished.store(true);
  return 0;
}

// this runs user input
static void
run_user_input_thread()
{
  int32_t chr;
  int32_t note;
  BOOL success;
  UINT32 buffer_frames;
  REFERENCE_TIME engine;
  REFERENCE_TIME period;
  LARGE_INTEGER qpc_count;
  CComPtr<IMMDevice> device;
  CComPtr<IAudioClock> clock;
  CComPtr<IAudioClient> client;
  CComPtr<IAudioRenderClient> render;
  CComPtr<IMMDeviceEnumerator> enumerator;  
  WAVEFORMATEXTENSIBLE format = make_audio_format();

  // get default render endpoint
  CHECK_COM(CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL, 
    __uuidof(IMMDeviceEnumerator), reinterpret_cast<void**>(&enumerator)));
  CHECK_COM(enumerator->GetDefaultAudioEndpoint(eRender, eMultimedia, &device));
  CHECK_COM(device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, 
    nullptr, reinterpret_cast<void**>(&client)));

  // open exclusive mode event driven stream
  CHECK_COM(client->GetDevicePeriod(&engine, &period));
  buffer_frames = static_cast<uint32_t>(period / reftimes_per_milli * sample_rate / millis_per_second);
  while((buffer_frames * frame_size) % alignment_requirement_bytes != 0) buffer_frames++;
  period = buffer_frames * millis_per_second * reftimes_per_milli / sample_rate;
  CHECK_COM(client->Initialize(AUDCLNT_SHAREMODE_EXCLUSIVE, AUDCLNT_STREAMFLAGS_EVENTCALLBACK, 
    period, period, reinterpret_cast<WAVEFORMATEX*>(&format), nullptr));  
  event_handle = CreateEvent(nullptr, FALSE, FALSE, nullptr);
  assert(event_handle != nullptr);
  CHECK_COM(client->SetEventHandle(event_handle));
  CHECK_COM(client->GetService(__uuidof(IAudioClock), reinterpret_cast<void**>(&clock)));
  CHECK_COM(client->GetService(__uuidof(IAudioRenderClient), reinterpret_cast<void**>(&render)));

  // start audio thread
  audio_thread_data data = { 0 };
  data.clock = clock;
  data.client = client;
  data.render = render;
  CreateThread(nullptr, 0, run_audio_thread, &data, 0, nullptr);

  // process user input
  // cdefg = notes, q = quit
  while((chr = _getch()) != 'q')
  {
    if(chr == 'c') note = 0;
    else if(chr == 'd') note = 1;
    else if(chr == 'e') note = 2;
    else if(chr == 'f') note = 3;
    else if(chr == 'g') note = 4;
    else continue;
    success = QueryPerformanceCounter(&qpc_count);
    note_down_msg msg;
    msg.note = note;
    msg.time_stamp_qpc = qpc_count.QuadPart;
    assert(success);
    note_msg_queue.push(msg);
    _putch(chr);
  }

  // cleanup
  stop_initiated.store(true);
  while(!stop_finished.load());
  success = CloseHandle(event_handle);
  assert(success);
}

int 
main(int argc, char** argv)
{
  // wraps COM init/cleanup
  CHECK_COM(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED));
  load_note_samples();
  run_user_input_thread();
  CoUninitialize();
  return 0;
}
  • For now, thank you. I'll study this code. Thanks again. – David Feb 17 '21 at 19:21
  • I found the solution. Thanks for everything and for all the time you have dedicated to me. – David Feb 18 '21 at 17:08
  • My pleasure. Just curious, what in the end was your problem with the original code? Also, care to accept the answer? – Sjoerd van Kreel Feb 18 '21 at 18:07
  • Every time I filled the fileBytes buffer with a new note, I always started from scratch, but that was wrong. I simply used a common index (wavPlaybackSample) between the wasapi thread and the FillBufferWasapi() function, and added a further variable k which always starts from zero. I hope you understand what I mean: for (i = wavPlaybackSample; i – David Feb 18 '21 at 18:26
  • I accepted your answer; thanks to infinity. – David Feb 18 '21 at 18:26
  • I was left with a doubt: once I have added two notes to form a chord, e.g. C + E, if I now subtract E, do I get the starting C back? That should be it, right? – David Mar 02 '21 at 07:59
  • Correct, but only if you subtract at the exact same sample position, from the exact same buffer position, that you used to mix it in. – Sjoerd van Kreel Mar 02 '21 at 14:23
  • I have another doubt: how is the CPU divided by Windows between the wasapi thread and the FillBufferWasapi() function? I was wondering whether, every time I press a key, the FillBufferWasapi() loops finish first, or whether, during the FillBufferWasapi() loops, the CPU is assigned to the wasapi thread and FillBufferWasapi() is forced to wait? – David Mar 02 '21 at 19:08
  • Assuming you're running on somewhat modern hardware, I'd say you have a multicore system, right? So the audio thread and the GUI thread really do run in parallel. If done right, the only time the audio thread has to wait is for wasapi to raise its buffer-ready event. – Sjoerd van Kreel Mar 04 '21 at 08:37
  • Yes, I am using an i7 CPU with 8 cores and Windows 8.1. I did not think that there was true parallelism of the operations, as I still notice a slight latency which I think is due to the wasapi buffer size of 160 frames. I wanted to try to halve this amount but wasapi won't let me. – David Mar 04 '21 at 10:08
  • 160 frames at a standard 44.1 or 48 kHz is around 3 milliseconds. That's too short to be audible. If you can really hear the delay between pressing a key and sound coming out of the speakers, there's something else going on. – Sjoerd van Kreel Mar 04 '21 at 15:44
  • Thanks for the info. I noticed that the time it takes FillBufferWasapi() to build a chord is only 0.5 ms, so the problem is not in the chord-building function. I tried to connect a MIDI keyboard, and if I listen to the sound of the keyboard together with that generated by the PC, I hear a reverb, therefore a delay. – David Mar 04 '21 at 17:23
  • I also tested the speed of the wasapi thread, and actually a buffer is released every 3.6 ms ((160/44100)*1000), so here too there is no problem. Could it be the integrated sound card or a slow driver? – David Mar 05 '21 at 18:56
  • I doubt it. I get low latency from an integrated Realtek interface, even in shared mode, and I don't notice any delay on my system. – Sjoerd van Kreel Mar 06 '21 at 21:41
  • I finally found out what the problem was: it was the WAV samples, which had 13 ms of silence at the beginning. Now I load the samples into the buffers skipping this silence, and the latency has zeroed. Thanks again for your time. – David Mar 07 '21 at 12:30
  • Interesting that you can take advantage of the wasapi thread, with its precise timing, to store in a temporary buffer what is played and write the temporary buffer at the end into WAV files; it is as if you were recording directly from the sound card: is this correct? – David Mar 10 '21 at 12:24
  • I don't follow? If you mean exclusive-mode capturing, then yes, there's very little between your code and the audio driver. – Sjoerd van Kreel Mar 11 '21 at 10:22
  • Maybe I explained myself badly. If, while playing, I store in an additional buffer (bufferXYZ) the same data that I send to wasapi with ReleaseBuffer(), in the end I get a bufferXYZ that contains a recording faithful to the original, right? By doing this, you don't need to use programs like Audacity to record what comes out of your sound card. – David Mar 11 '21 at 10:48
  • Correct, just be sure not to do I/O on the audio thread. – Sjoerd van Kreel Mar 11 '21 at 10:58
  • Thank you. I'll do as you suggest. I save bufferXYZ to a .wav file only at the end, so I don't have any kind of delay. – David Mar 11 '21 at 11:25
  • I take this opportunity for another question: when I add notes to form chords, sometimes I hear a click; how can this be solved? – David Mar 11 '21 at 13:10
  • Might be you're creating samples above the maximum amplitude (distortion). – Sjoerd van Kreel Mar 12 '21 at 12:49
  • Is there a way to write this compactly: int tmp = 34456; if (tmp > 32767) tmp = 32767; else if (tmp < -32768) tmp = -32768; – David Mar 15 '21 at 19:43
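For reference, a compact form of that clamp (an added sketch, assuming a C++17 compiler; std::clamp lives in <algorithm>):

#include <algorithm>

int tmp = 34456;
tmp = std::clamp(tmp, -32768, 32767); // tmp == 32767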