While learning Vulkan, I frequently run into code like this:

vk::raii::Semaphore imageAvailable = device.createSemaphore(...);
vk::raii::Semaphore renderingFinished = device.createSemaphore(...);

while (shouldRun) {
    // Platform loop

    auto [result, imageIndex] = swapchain.acquireNextImage(UINT64_MAX, *imageAvailable, {});
    queue.submit(
        vk::SubmitInfo(*imageAvailable, waitDstStage, *commandBuffers[imageIndex],
                       *renderingFinished),
        {});
    queue.presentKHR(vk::PresentInfoKHR(*renderingFinished, *swapchain, imageIndex));
}

In this example, we ask the presentation engine for the index of the next available image and ask it to signal imageAvailable when that image becomes available for us to use. We then submit our pre-recorded command buffer, which draws to its corresponding image (for simplicity, I record one command buffer per swapchain image at initialization and draw the same thing every frame). The submission requires that imageAvailable be signaled before execution reaches waitDstStage, and asks that renderingFinished be signaled when execution completes. Finally, we hand the image back to the presentation engine and ask it to present the image once renderingFinished becomes signaled.

Does this potentially exhibit undefined behavior due to improper synchronization?

If I understand the spec correctly, I can imagine a scenario in which two submit calls are simultaneously waiting on imageAvailable. None of the calls in the main loop above is required to block at all, so the loop could potentially execute two calls to vkQueueSubmit, both waiting on imageAvailable, before imageAvailable has been signaled. In my head, when it does become signaled, both batches would execute, and one would potentially access an image before the presentation engine has released it to us. However, in theory, this can't happen, because the spec says:

If 'semaphore' is not 'VK_NULL_HANDLE', the semaphore must be unsignaled, with no signal or wait operations pending. It will become signaled when the application can use the image.

After the first call to vkQueueSubmit, the semaphore could have a wait operation pending. At the second call, the wait operation could still be pending. This would result in undefined behavior, and my Vulkan program is at fault for any application misbehavior.

Firstly, am I to understand that I may treat vkQueuePresentKHR as a "command" (or at least executing commands) within a queue, and I can apply synchronization rules to it? Secondly, am I correct in assuming that I require additional synchronization in the above code to avoid potentially invoking undefined behavior?

Joshua Hyatt

1 Answer

You do not require "additional synchronization;" you require additional semaphores.

You can potentially have outstanding synchronizations for each image in your swapchain. As such, you should have one set of semaphores for each image. You don't have to number them by index or anything; it's fine to just put them in a circular buffer. But the main thing is that you cannot acquire with a semaphore (or do any other signal operations with it) until that semaphore has been unsignaled by a prior queue submission operation.

With a circular buffer of semaphores, you won't run into this problem. Since your rendering operation will typically need to be multiply buffered as well (so that you aren't writing to scratch memory that's still being read from), this is just one more thing that needs to be buffered.

And since any multiply-buffered system has limits on the maximum number of buffers in flight, you would naturally guard it with fences. Reusing a semaphore (or any other per-frame buffer) requires checking the fence associated with the queue operation that waits on, and so consumes, that semaphore for the frame. This fence will be several frames old, so it should almost certainly be set. And if it isn't, then the CPU has gotten ahead of the GPU and needs to slow down if it wants to keep drawing stuff.
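
As a rough illustration of the circular buffer described above, here is a minimal sketch of the bookkeeping only. It deliberately uses plain stand-in types instead of real Vulkan objects so that it compiles without the SDK. `FrameSync`, `FrameRing`, `tryBegin`, and `gpuFinished` are hypothetical names made up for this example, not part of Vulkan or Vulkan-Hpp. In a real renderer the `bool` would be a fence created in the signaled state, and the two `int` members would be the per-frame semaphores.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-ins for real Vulkan objects (assumption: in real code these would be
// vk::raii::Semaphore and vk::raii::Fence, one set per frame in flight).
struct FrameSync {
    int imageAvailable = 0;     // placeholder for the "acquire" semaphore
    int renderingFinished = 0;  // placeholder for the "present" semaphore
    bool fenceSignaled = true;  // placeholder for a fence that starts signaled
};

// Circular buffer of N per-frame synchronization slots.
template <std::size_t N>
class FrameRing {
public:
    // Returns the slot to use for the next frame, or nullptr if that slot's
    // fence has not signaled yet, i.e. the CPU has gotten ahead of the GPU.
    // A real renderer would block on the fence here instead of returning.
    FrameSync* tryBegin() {
        FrameSync& slot = slots_[next_];
        if (!slot.fenceSignaled) {
            return nullptr;  // slot still in flight; do not advance
        }
        slot.fenceSignaled = false;  // models resetting the fence before reuse
        next_ = (next_ + 1) % N;     // advance around the ring
        return &slot;
    }

    // Models the GPU finishing the submission that was guarded by this slot.
    void gpuFinished(FrameSync& slot) { slot.fenceSignaled = true; }

private:
    std::array<FrameSync, N> slots_{};
    std::size_t next_ = 0;
};
```

In the loop from the question, each iteration would take a slot from the ring, acquire the image with that slot's imageAvailable, submit with that slot's semaphores, and pass the slot's fence as the last argument to queue.submit instead of {}. With two slots, a third frame cannot begin until the first one's fence has signaled, which is exactly the "CPU needs to slow down" case.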

Nicol Bolas
  • Why does this solve the problem if none of the three calls in the main loop block? If `vkAcquireNextImageKHR` (or `vkQueuePresentKHR`) is *never* required to block, couldn't I theoretically "acquire" an image several times? If so, one set of semaphores per image wouldn't be sufficient. – Joshua Hyatt Oct 03 '22 at 20:38
  • @JoshuaHyatt: `vkAcquireNextImageKHR` will not return an image that has already been acquired. If you run out of images in the swap chain (i.e., they are not yet finished being presented), then `vkAcquireNextImageKHR` will either block (pursuant to the time you gave it) or return a failure to acquire an image with `VK_NOT_READY`. – Nicol Bolas Oct 03 '22 at 21:12
  • In that case, I am probably misinterpreting the spec: `Applications should not rely on vkAcquireNextImageKHR blocking in order to meter their rendering speed. *The implementation may return from this function immediately regardless of how many presentation requests are queued, and regardless of when queue presentation requests will complete relative to the call.*` I read this as "image acquisition *may* never fail, since the implementation may return the index of an in-use image." Is this interpretation incorrect? How does the spec specify only `imageCount` images can be acquired? – Joshua Hyatt Oct 03 '22 at 21:35
  • @JoshuaHyatt: Where are you getting "return the index of an in-use image" from? That paragraph says *nothing* about what the index will be. "*How does the spec specify only imageCount images can be acquired?*" Because only `imageCount` images *exist* in the swapchain. When you created a swapchain, you got `imageCount` `VkImage` objects with it. No more, no less. – Nicol Bolas Oct 03 '22 at 21:42
  • I was under the impression that `vkAcquireNextImageKHR` returns once the presentation engine knows which image the client can use next, but that the image doesn't have to be available *yet*, which is what the semaphore is for: essentially, "use swapchain image 'x', but not until I signal your semaphore". Reading [your answer here](https://stackoverflow.com/a/60420169/3053072) gave me the same impression: the CPU could be requesting to acquire image #900 before the GPU has finished work on frame #5 (@Jherico's comment below your answer). How is that possible with only a few (<900) semaphores? – Joshua Hyatt Oct 03 '22 at 23:47
  • Thank you for the clarification, your answer is very comprehensive. – Joshua Hyatt Oct 04 '22 at 02:34