0

I'm writing a little Console-Game-Engine and for better performance I wanted 2 threads (or more but 2 for this task) using two buffers. One thread is drawing the next frame in the first buffer while the other thread is reading the current frame from the second buffer. Then the buffers get swapped.

Of cause I can only swap them if both threads finished their task and the drawing/writing thread happened to be the one waiting. But the time it is waiting systematicly switches more or less between two values, here a few of the messurements I made (in microseconds):

0, 36968, 0, 36260, 0, 35762, 0, 38069, 0, 36584, 0, 36503

It's pretty obvious that this is not a coincidence but I wasn't able to figure out what the problem was as this is the first time I'm using threads.

Here the code, ask for more if you need it, I think it's too much to post it all:

header-file (Manager currently only adds a pointer to my WinAppBase-class):

class SwapChain : Manager
{
    WORD                            *pScreenBuffer1, *pScreenBuffer2, *pWritePtr, *pReadPtr, *pTemp;
    bool                            isRunning, writingFinished, readingFinished, initialized;
    std::mutex                      lockWriting, lockReading;
    std::condition_variable         cvWriting, cvReading;
    DWORD                           charsWritten;
    COORD                           startPosition;
    int                             screenBufferWidth;


    // THREADS (USES NORMAL THREAD AS SECOND THREAD)
    void ReadingThread();

    // THIS FUNCTION IS ONLY FOR INTERN USE
    void SwapBuffers();

public:
    // USE THESE TO CONTROL WHEN THE BUFFERS GET SWAPPED
    void BeginDraw();
    void EndDraw();

    // PUT PIXEL | INLINED FOR BETTER PERFORMANCE
    inline void PutPixel(short xPos, short yPos, WORD color)
    {
        this->pWritePtr[(xPos * 2) + yPos * screenBufferWidth] = color;
        this->pWritePtr[(xPos * 2) + yPos * screenBufferWidth + 1] = color;

    }

    // GENERAL CONTROL OVER SWAP CHAIN
    void Initialize();
    void Run();
    void Stop();

    // CONSTRUCTORS
    SwapChain(WinAppBase * pAppBase);
    virtual ~SwapChain();
};

Cpp-file

SwapChain::SwapChain(WinAppBase * pAppBase)
    :
    Manager(pAppBase)
{
    this->isRunning             =   false;
    this->initialized           =   false;
    this->pReadPtr              =   NULL;
    this->pScreenBuffer1        =   NULL;
    this->pScreenBuffer2        =   NULL;
    this->pWritePtr             =   NULL;
    this->pTemp                 =   NULL;
    this->charsWritten          =   0;
    this->startPosition         =   { 0, 0 };
    this->readingFinished       =   0;
    this->writingFinished       =   0;
    this->screenBufferWidth     =   this->pAppBase->screenBufferInfo.dwSize.X;
}

SwapChain::~SwapChain()
{
    this->Stop();

    if (_CrtIsValidHeapPointer(pReadPtr))
        delete[] pReadPtr;

    if (_CrtIsValidHeapPointer(pScreenBuffer1))
        delete[] pScreenBuffer1;

    if (_CrtIsValidHeapPointer(pScreenBuffer2))
        delete[] pScreenBuffer2;

    if (_CrtIsValidHeapPointer(pWritePtr))
        delete[] pWritePtr;
}

void SwapChain::ReadingThread()
{
    while (this->isRunning)
    {
        this->readingFinished = 0;

        WriteConsoleOutputAttribute(
            this->pAppBase->consoleCursor,
            this->pReadPtr,
            this->pAppBase->screenBufferSize,
            this->startPosition,
            &this->charsWritten
        );
        memset(this->pReadPtr, 0, this->pAppBase->screenBufferSize);

        this->readingFinished = true;
        this->cvWriting.notify_all();

        if (!this->writingFinished)
        {
            std::unique_lock<std::mutex> lock(this->lockReading);
            this->cvReading.wait(lock);
        }
    }
}

void SwapChain::SwapBuffers()
{
    this->pTemp     =   this->pReadPtr;
    this->pReadPtr  =   this->pWritePtr;
    this->pWritePtr =   this->pTemp;
    this->pTemp     =   NULL;
}

void SwapChain::BeginDraw()
{
    this->writingFinished = false;
}

void SwapChain::EndDraw()
{
    TimePoint tpx1, tpx2;

    tpx1 = Clock::now();
    if (!this->readingFinished)
    {
        std::unique_lock<std::mutex> lock2(this->lockWriting);
        this->cvWriting.wait(lock2);
    }
    tpx2 = Clock::now();
    POST_DEBUG_MESSAGE(std::chrono::duration_cast<std::chrono::microseconds>(tpx2 - tpx1).count(), "EndDraw wating time");

    SwapBuffers();
    this->writingFinished = true;
    this->cvReading.notify_all();
}

void SwapChain::Initialize()
{
    if (this->initialized)
    {
        POST_DEBUG_MESSAGE(Result::CUSTOM, "multiple initialization");
        return;
    }

    this->pScreenBuffer1 = (WORD *)malloc(sizeof(WORD) * this->pAppBase->screenBufferSize);
    this->pScreenBuffer2 = (WORD *)malloc(sizeof(WORD) * this->pAppBase->screenBufferSize);

    for (int i = 0; i < this->pAppBase->screenBufferSize; i++)
    {
        this->pScreenBuffer1[i] = 0x0000;
    }
    for (int i = 0; i < this->pAppBase->screenBufferSize; i++)
    {
        this->pScreenBuffer2[i] = 0x0000;
    }

    this->pWritePtr = pScreenBuffer1;
    this->pReadPtr = pScreenBuffer2;

    this->initialized = true;
}

void SwapChain::Run()
{
    this->isRunning = true;
    std::thread t1(&SwapChain::ReadingThread, this);

    t1.detach();
}

void SwapChain::Stop()
{
    this->isRunning = false;
}

This is where I run the SwapChain-class from:

void Application::Run()
{
    this->engine.graphicsmanager.swapChain.Initialize();

    Sprite<16, 16> sprite(&this->engine);
    sprite.LoadSprite("engine/resources/TestData.xml", "root.test.sprites.baum");

    this->engine.graphicsmanager.swapChain.Run();

    int a, b, c;

    for (int i = 0; i < 60; i++)
    {
        this->engine.graphicsmanager.swapChain.BeginDraw();

        for (c = 0; c < 20; c++)
        {

            for (a = 0; a < 19; a++)
            {
                for (b = 0; b < 10; b++)
                {
                    sprite.Print(a * 16, b * 16);
                }
            }

        }

        this->engine.graphicsmanager.swapChain.EndDraw();
    }

    this->engine.graphicsmanager.swapChain.Stop();


    _getch();
}

The for-loops above simply draw the sprite 20 times from the top-left corner to the bottom-right corner of the console - the buffers don't get swapped during that, and that again for a total of 60 times (so the buffers get swapped 60 times). sprite.Print uses the PutPixel function of SwapChain.

Here the WinAppBase (which consits more or less of global-like variables)

class WinAppBase
{
public:

    // SCREENBUFFER
    CONSOLE_SCREEN_BUFFER_INFO      screenBufferInfo;
    long                            screenBufferSize;

    // CONSOLE
    DWORD                           consoleMode;
    HWND                            consoleWindow;
    HANDLE                          consoleCursor;
    HANDLE                          consoleInputHandle;
    HANDLE                          consoleHandle;
    CONSOLE_CURSOR_INFO             consoleCursorInfo;
    RECT                            consoleRect;
    COORD                           consoleSize;

    // FONT
    CONSOLE_FONT_INFOEX             fontInfo;

    // MEMORY
    char *                          pUserAccessDataPath;



public:
    void reload();

    WinAppBase();
    virtual ~WinAppBase();
};

There are no errors, simply this alternating waitng time. Maybe you'd like to start by looking if I did the synchronisation of the threads correctly? I'm not exactly sure how to use a mutex or condition-variables so it might comes from that.

Apart from that it is working fine, the sprites are shown as they should.

Anyfail
  • 29
  • 3
  • There are a tons of data races in your code. For example, all your bools that are accessed from different threads should be `std::atomic` instead, but that alone won't fix it. You should also take into account that between the `if (...)` and the code you conditionally execute, your condition might be changed by the other thread. – Max Langhof May 23 '18 at 13:06
  • Okay thanks I'll change it to std::atomic then - do I have to do something like that for every resource that is used by multiple threads or only for those which could possibly be used at the same time? – Anyfail May 23 '18 at 13:10
  • "You should also take into account that between the if (...) and the code you conditionally execute, your condition might be changed by the other thread" Do have an idea how to fix that? – Anyfail May 23 '18 at 13:18
  • Using condition variables is already a good idea in your case. However, you should follow the description [here](http://en.cppreference.com/w/cpp/thread/condition_variable) by only modifiying the share state while you are holding a mutex specific to that state (i.e. variable). This then prevents more than one thread from accessing (or making assumptions about) the state at once. The only remaining `atomic` you need would then be for `isRunning`. The key thing to consider is that execution may change between the two threads at any point. – Max Langhof May 23 '18 at 13:26

1 Answers1

0

The clock you are using may have limited resolution. Here is a random example of a clock provided by Microsoft with 15 ms (15000 microsecond) resolution: Why are .NET timers limited to 15 ms resolution?

If one thread is often waiting for the other, it is entirely possible (assuming the above clock resolution) that it sometimes waits two clockticks and sometimes none. Maybe your clock only has 30 ms resolution. We really can't tell from the code. Do you get more precise measurements elsewhere with this clock?

There are also other systems in play such as the OS scheduler or whatever controls your std::threads. That one is (hopefully) much more granular, but how all these interactions play out doesn't have to be obvious or intuitive.

Max Langhof
  • 23,383
  • 5
  • 39
  • 72
  • I'm using std::chrono::high_resolution_clock I don't know how precise it is and how much time it spends on storing the time but as it uses microseconds I think it should be pretty accurate. I've only shown a few of the messurements but it really keeps throughout the whole 60 loops switching, there is not a single time where it's two times 0 or two times over 30000 – Anyfail May 23 '18 at 13:26
  • You are timing how long it takes to wait for the lock to be acquired and the condition variable to be notified if reading hasn't finished yet. Presumably, you get 0 seconds when `readingFinished == true` and a time depending on the lock/cv granularity otherwise. That this alternates may be a result of your locking pattern and thread scheduling. It's hard to tell. – Max Langhof May 23 '18 at 13:29
  • Okay thanks, I'll try out your advices from above now and look if the outcome changes - otherwise I might just accept the way it is behaving as it already does what it's supposed to – Anyfail May 23 '18 at 13:42