I started looking into multi-threading the render loop because I wanted the client area to keep updating while a user holds the title bar, or holds the mouse on a border to resize the window without moving the mouse. The render loop redraws the content of the window using either Vulkan or DirectX, so it uses a swap chain and the present mechanism to get the latest frame displayed on the screen.
I more or less got this working (more on this later). But I also want to ensure the rendering is as clean as possible when the window is resized. As you all know if you have dealt with this before, Windows generally draws garbage where new pixels are exposed when you resize (enlarge the window).
That seems to be a recurring question on Stack Overflow, and there are rather well-written and interesting answers, such as this one, which already goes into a lot of depth:
However:
As the answer's author mentions, they mostly made guesses about what Windows is doing. So I would be interested to see an answer to this problem that's not based only on guesswork (if someone has that knowledge).
The answers provide insights but no solution.
Furthermore, the question is not being specifically asked within the context of a multi-threaded app.
Now, the answers suggest that Windows probably expects the application to redraw the client area within 1/60th of a second (assuming a 60 Hz refresh rate for the screen). My understanding from the various posts is also that on Windows (which I don't know well, I have to say) it's better to handle the WM_WINDOWPOSCHANGING message rather than WM_SIZE or WM_SIZING, as Windows would only start waiting for you to redraw the client area (before doing it itself) when you return from WM_WINDOWPOSCHANGING.
So my solution was to do this in the main thread (the main thread that deals with Windows messages):
case WM_WINDOWPOSCHANGING:
{
    std::unique_lock lock(m);
    cv.wait(lock, []() { return is_drawing == false; });
    is_resizing = true;
    RECT client_rect;
    GetClientRect(hwnd2, &client_rect);
    window_width = client_rect.right - client_rect.left;
    window_height = client_rect.bottom - client_rect.top;
    draw();
    is_resizing = false;
    lock.unlock();
    cv.notify_one();
    return 0;
}
and that in the render thread:
void render_func()
{
    while (keep_running)
    {
        std::unique_lock lock(m);
        cv.wait(lock, []() { return is_resizing == false; });
        is_drawing = true;
        draw();
        is_drawing = false;
        lock.unlock();
        cv.notify_one();
    }
}
void draw()
{
    if (size_changed)
        recreate_swapchain(window_width, window_height);
    acquire_texture_view_from_swapchain();
    do_GPU_magic();
    present(); // swap buffers
}
I am using a condition_variable to get WindowProc to wait if the render thread is drawing. If the render thread is not drawing, we compute the client area size, then force a draw to be sure that drawing happens before we return from WindowProc, and then we set the is_resizing flag to false to signal the render thread that it can resume rendering normally.
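The handshake described above can be modelled without a window or a GPU. This is a minimal, portable sketch and not the real code: `HandshakeModel`, `on_resize`, `render_loop` and `run_handshake_model` are hypothetical names, and `draw` is a stub that only counts calls and records whether two draws ever overlapped.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable model of the two-flag handshake (assumption: draw() is a stub).
struct HandshakeModel {
    std::mutex m;
    std::condition_variable cv;
    bool is_resizing = false;  // guarded by m
    bool is_drawing = false;   // guarded by m
    std::atomic<bool> keep_running{true};
    std::atomic<int> active_draws{0};
    std::atomic<bool> overlap_seen{false};
    std::atomic<int> resize_draws{0};
    std::atomic<int> render_draws{0};

    // Stand-in for the real draw: counts the call and flags any overlap.
    void draw(std::atomic<int>& counter) {
        if (active_draws.fetch_add(1) != 0) overlap_seen = true;
        ++counter;
        active_draws.fetch_sub(1);
    }

    // What the message thread does when a resize message arrives.
    void on_resize() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !is_drawing; });
        is_resizing = true;
        draw(resize_draws);
        is_resizing = false;
        lock.unlock();
        cv.notify_one();
    }

    // What the render thread does until it is asked to stop.
    void render_loop() {
        while (keep_running.load()) {
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [this] { return !is_resizing; });
                is_drawing = true;
                draw(render_draws);
                is_drawing = false;
            }
            cv.notify_one();
            std::this_thread::yield();  // let the other thread reach the lock
        }
    }
};

// Simulates `resizes` resize messages against a running render loop.
// Returns true if every resize draw ran and no two draws overlapped.
bool run_handshake_model(int resizes) {
    HandshakeModel model;
    std::thread render([&] { model.render_loop(); });
    for (int i = 0; i < resizes; ++i) model.on_resize();
    model.keep_running = false;
    render.join();
    return !model.overlap_seen.load() && model.resize_draws.load() == resizes;
}
```

One thing the model makes visible: both flags are only ever true while the mutex is held, so in this arrangement the mutex alone already keeps the two draws apart, and the waits on the flags never actually block.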
The reason why I came up with all these loops is that, reading the referenced post, I understood that Windows expects you to redraw the client area in roughly 16 ms, and that if you don't do it in that timeframe, it does it for you (with whatever means it can come up with: background color, garbage, etc.). So forcing a draw() call before we return from WindowProc should allow that. I also understand that with DWM, the redraw by Windows is asynchronous, so it seems you don't know for sure when Windows will actually paint into the client area.
To be sure the draw call is super quick, the only thing I do is clear the buffer with a plain color. The background color at the window's creation is red. When I call draw() from WindowProc, I set the background color of the buffer rendered via Vulkan / DirectX to blue. When the window's content is rendered by the render thread, the background color is set to green.
Also, the present mode is set to immediate, meaning the buffer should be presented to the surface as soon as possible. So I should really be under the 16 ms requirement (I timed 465 microseconds for the entire process).
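A minimal sketch of that comparison, assuming a 60 Hz display. The helper names `frame_budget`, `time_once` and `fits_in_frame` are made up for illustration; the 465 microsecond figure is the measurement quoted above.

```cpp
#include <cassert>
#include <chrono>

using micros = std::chrono::microseconds;

// Time available per refresh at a given rate, rounded down to whole microseconds.
constexpr micros frame_budget(int refresh_hz) {
    return micros(1'000'000 / refresh_hz);
}

// Runs `work` once and returns how long the call took on the CPU side.
template <class Work>
micros time_once(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    return std::chrono::duration_cast<micros>(std::chrono::steady_clock::now() - start);
}

// True when a measured duration is shorter than one refresh interval.
constexpr bool fits_in_frame(micros measured, int refresh_hz) {
    return measured < frame_budget(refresh_hz);
}
```

At 60 Hz the budget comes out at 16666 microseconds, so 465 microseconds fits with a wide margin. Note that a measurement like this only covers how long the calls take to return; it says nothing about when the compositor actually shows the frame.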
Interestingly, I do get the "expected" results. It's green when I do nothing, and blue when I resize the window. Also, the redraw is perfectly smooth, with no garbage drawn, and of course the content is redrawn even when I move the window. Super.
Except that I occasionally get a fully red window. Somehow this means that sometimes the code fails to draw at the right time, and Windows decides to fill the entire client area with the window's initial background color.
I have no idea what else to try, and it already feels quite hacky. It seems like I am in muddy territory here; I am not sure it's even possible to do reliably (because of that asynchronous DWM process). I don't think many people have needed to tackle that problem before, even though in 2023, I'd think this would be a common requirement.
Do you have feedback on the approach I have chosen? Have you managed to get this to work somehow? If you could share your solution, it would be greatly appreciated.
EDIT 1
Following @SimonMourier's request, here is some code. I can't share the D3D12 code for business reasons, but I can share an example I put together using Dawn, which we have been testing internally. Not sure this is of any use if you don't have the Dawn libs at hand, but they are not difficult to build. Here is the code:
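#include <Windows.h>
#include <algorithm>          // std::find_if
#include <atomic>             // std::atomic
#include <cassert>
#include <chrono>
#include <condition_variable> // std::condition_variable
#include <cstdlib>            // rand, RAND_MAX
#include <iostream>
#include <memory>             // std::unique_ptr
#include <mutex>
#include <semaphore>
#include <thread>
#include <vector>             // std::vector
#include "dawn/webgpu_cpp.h"
#include "dawn/dawn_proc.h"
#include "dawn/native/DawnNative.h"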
dawn_native::Instance instance;
wgpu::Device device;
wgpu::Queue queue;
wgpu::Surface surface;
wgpu::SwapChain swapChain;
std::atomic<uint32_t> window_width { 640 };
std::atomic<uint32_t> window_height { 480 };
using namespace std::chrono_literals;
std::atomic<bool> keep_running = true;
std::mutex m;
#ifndef UNICODE
#define UNICODE
#endif
HWND hwnd2;
void draw(float r, float g, float b);
bool is_resizing = false;
bool is_drawing = true;
std::condition_variable cv;
wgpu::PresentMode present_mode = wgpu::PresentMode::Fifo;
LRESULT CALLBACK WindowProc(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    switch (uMsg)
    {
    case WM_DESTROY:
        PostQuitMessage(0);
        keep_running = false;
        return 0;
    case WM_WINDOWPOSCHANGING:
    case WM_WINDOWPOSCHANGED:
    {
        if (!surface)
            return 0;
        RECT client_rect;
        GetClientRect(hwnd2, &client_rect);
        window_width = client_rect.right - client_rect.left;
        window_height = client_rect.bottom - client_rect.top;
#ifdef MULTITHREADED
        std::unique_lock lock(m);
        cv.wait(lock, []() { return is_drawing == false; });
        is_resizing = true;
#endif
        draw(0,0,1);
#ifdef MULTITHREADED
        is_resizing = false;
        lock.unlock();
        cv.notify_one();
#endif
        return 0;
    }
    }
    return DefWindowProc(hwnd, uMsg, wParam, lParam);
}
void create_window()
{
    const wchar_t CLASS_NAME[] = L"MyWindowClass";
    WNDCLASS wc = {};
    wc.lpfnWndProc = WindowProc;
    wc.hInstance = GetModuleHandle(nullptr);
    wc.lpszClassName = CLASS_NAME;
    wc.hbrBackground = CreateSolidBrush(RGB(255, 0, 0));
    wc.style = CS_HREDRAW | CS_VREDRAW;
    RegisterClass(&wc);
    DWORD dwStyle = WS_OVERLAPPEDWINDOW;
    RECT rc = { 0, 0, (int32_t)window_width, (int32_t)window_height };
    AdjustWindowRectEx(&rc, dwStyle, FALSE, 0);
    hwnd2 = CreateWindowEx(
        0,
        CLASS_NAME,
        L"",
        dwStyle,
        CW_USEDEFAULT, CW_USEDEFAULT,
        rc.right - rc.left, rc.bottom - rc.top,
        NULL,
        NULL,
        GetModuleHandle(nullptr),
        NULL);
    assert(hwnd2 != nullptr);
    ShowWindow(hwnd2, SW_SHOW);
}
// END OF WINDOWS STUFF
std::unique_ptr<wgpu::ChainedStruct> SetupWindowAndGetSurfaceDescriptor() {
    std::unique_ptr<wgpu::SurfaceDescriptorFromWindowsHWND> desc =
        std::make_unique<wgpu::SurfaceDescriptorFromWindowsHWND>();
    desc->hwnd = hwnd2;
    desc->hinstance = GetModuleHandle(nullptr);
    return std::move(desc);
}
wgpu::Surface CreateSurfaceForWindow(const wgpu::Instance& instance) {
    std::unique_ptr<wgpu::ChainedStruct> chainedDescriptor =
        SetupWindowAndGetSurfaceDescriptor();
    wgpu::SurfaceDescriptor descriptor;
    descriptor.nextInChain = chainedDescriptor.get();
    wgpu::Surface surface = instance.CreateSurface(&descriptor);
    return surface;
}
void init()
{
    instance.DiscoverDefaultAdapters();
    std::vector<dawn::native::Adapter> adapters = instance.GetAdapters();
    auto adapterIt = std::find_if(adapters.begin(), adapters.end(),
        [](const dawn::native::Adapter adapter) -> bool {
            wgpu::AdapterProperties properties;
            adapter.GetProperties(&properties);
            return properties.backendType == wgpu::BackendType::Vulkan;
        });
    if (adapterIt == adapters.end()) {
        return;
    }
    dawn::native::Adapter chosenAdapter = *adapterIt;
    DawnProcTable procs(dawn_native::GetProcs());
    dawnProcSetProcs(&procs);
    device = wgpu::Device::Acquire(chosenAdapter.CreateDevice());
    queue = device.GetQueue();
    surface = CreateSurfaceForWindow(instance.Get());
    wgpu::SwapChainDescriptor swapChainDesc = {
        .usage = wgpu::TextureUsage::RenderAttachment,
        .format = wgpu::TextureFormat::BGRA8Unorm,
        .width = window_width,
        .height = window_height,
        .presentMode = present_mode,
    };
    swapChain = device.CreateSwapChain(surface, &swapChainDesc);
}
int w = 0, h = 0;
void draw(float r, float g, float b)
{
    if (!surface)
        return;
    if (w != window_width || h != window_height) {
        w = window_width, h = window_height;
        wgpu::SwapChainDescriptor swapChainDesc = {
            .usage = wgpu::TextureUsage::RenderAttachment,
            .format = wgpu::TextureFormat::BGRA8Unorm,
            .width = window_width,
            .height = window_height,
            .presentMode = present_mode,
        };
        swapChain = device.CreateSwapChain(surface, &swapChainDesc);
    }
    wgpu::TextureView backBuffer = swapChain.GetCurrentTextureView();
    wgpu::CommandEncoder encoder = device.CreateCommandEncoder();
    wgpu::RenderPassColorAttachment renderPassColorAttachment = {
        .view = backBuffer,
        .resolveTarget = nullptr,
        .loadOp = wgpu::LoadOp::Clear,
        .storeOp = wgpu::StoreOp::Store,
        .clearValue = {r, g, b ,1},
    };
    wgpu::RenderPassDescriptor renderPassDescriptor = {
        .colorAttachmentCount = 1,
        .colorAttachments = &renderPassColorAttachment,
        .depthStencilAttachment = nullptr,
    };
    wgpu::RenderPassEncoder pass = encoder.BeginRenderPass(&renderPassDescriptor);
    pass.End();
    wgpu::CommandBuffer command = encoder.Finish();
    queue.Submit(1, &command);
    swapChain.Present();
}
void render_func()
{
    while (keep_running)
    {
        std::unique_lock lock(m);
        cv.wait(lock, []() { return is_resizing == false; });
        is_drawing = true;
        draw(0,rand() / (float)RAND_MAX,0);
        is_drawing = false;
        lock.unlock();
        cv.notify_one();
        // give a chance to the event loop to process messages
        std::this_thread::sleep_for(4ms);
    }
}
void handleWindowsEvent(HWND hwnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{}
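int main()
{
    create_window();
    init();
#ifdef MULTITHREADED
    std::thread render_thread(render_func);
#endif
    MSG msg = {};
    while (keep_running)
    {
        if (PeekMessage(&msg, hwnd2, 0U, 0U, PM_REMOVE)) {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
#ifndef MULTITHREADED
        draw(0,rand() / (float)RAND_MAX,0);
#endif
    }
#ifdef MULTITHREADED
    // a joinable std::thread must be joined before it is destroyed,
    // otherwise std::terminate is called on exit
    render_thread.join();
#endif
    return 0;
}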
Compiled with:
clang++ -O3 -std=c++20 -I$DAWN_PATH/dawn/include -I$DAWN_PATH/dawn/out/Release/gen/include -I$DAWN_PATH/dawn/src/dawn/ -L$DAWN_PATH/dawn/out/Release -ldawn_native.dll -ldawn_proc.dll -ldawn_platform.dll webgpu_cpp.o Source.cpp -DMULTITHREADED -lUser32 -lGdi32 -DUNICODE
Note: I had to put a small sleep in the render loop otherwise the event loop never gets a chance to grab the lock. Now if you run this, and resize the window, you will see that you get green when you don't resize, blue when you resize, and red sometimes.
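The sleep is there because `std::mutex` makes no fairness promise, so a loop that unlocks and immediately re-locks can keep winning. One possible alternative, sketched here in portable form and not tested against the Dawn program, is to have the message thread announce its intent before it takes the lock, so the render thread steps aside on its next iteration. `ResizeFirstGate` and `run_resize_first` are hypothetical names, and the draws are stubs that only count.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: lets a resize draw jump ahead of the render loop.
class ResizeFirstGate {
public:
    // Message thread: announce the resize BEFORE taking the lock, so the
    // render thread sees it on its next iteration and waits instead.
    template <class Draw>
    void resize_draw(Draw draw) {
        pending_.fetch_add(1);
        std::unique_lock<std::mutex> lock(m_);
        draw();
        pending_.fetch_sub(1);  // changed under the lock: no missed wake-up
        lock.unlock();
        cv_.notify_all();
    }

    // Render thread: draw only while no resize is waiting.
    template <class Draw>
    void render_draw(Draw draw) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return pending_.load() == 0; });
        draw();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::atomic<int> pending_{0};
};

// Drives the gate with a render loop that never sleeps and returns how many
// resize draws completed.
int run_resize_first(int resizes) {
    ResizeFirstGate gate;
    std::atomic<bool> keep_running{true};
    int resize_draws = 0;   // only touched while the gate's lock is held
    long render_draws = 0;  // only touched while the gate's lock is held
    std::thread render([&] {
        while (keep_running.load())
            gate.render_draw([&] { ++render_draws; });
    });
    for (int i = 0; i < resizes; ++i)
        gate.resize_draw([&] { ++resize_draws; });
    keep_running = false;
    render.join();
    return resize_draws;
}
```

With this arrangement the render thread can finish at most the frame it has already started before the resize draw runs. Whether that changes anything about the occasional red frame is a separate question; it only addresses the need for the sleep.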
I looked into the composition mechanism, but I am not sure where it would fit into a program that uses a real-time backend to draw to the screen.
Also, to be clear, the problems I need to solve are:
Be sure the content is refreshed (the render loop keeps going) when a user moves the window or holds a resize position. In the single-threaded model, Windows blocks the refresh because of the modal move/size loop.
Ensure Windows does a clean refresh and does not draw garbage in the newly exposed pixels when you enlarge the window.
Interestingly, I don't see this garbage when you use the single-threaded option in this program, even when I resize the window rather quickly.
Anyway, if this can be done using a single-threaded approach and whatever modern architecture is available on Windows, I'd be super keen on using it.