Why is a VkFence necessary for every swapchain command buffer when already using semaphores?

Question

I am using exactly 3 images for the swapchain and one VkCommandBuffer (CB) per swapchain image. GPU synchronization is done with two semaphores, one for presentation_finished and one for rendering_finished. The present mode is VK_PRESENT_MODE_MAILBOX_KHR (quick overview).

Now when I am running my example without waiting for any CB fences, the validation layers report this error as soon as any swapchain image is used for the second time:

Calling vkBeginCommandBuffer() on active CB before it has completed. You must check CB fence before this call.

At first sight this seems reasonable to me, as processing the commands from the CB might just not be finished yet. But the more I think about it, the more I come to the conclusion that this should not happen at all.

My current understanding is that when vkAcquireNextImageKHR returns a specific image index, it implies that rendering to the returned image has finished.

That is because I'm passing the rendering_finished semaphore to vkQueueSubmit to be signaled when rendering finishes and to vkQueuePresentKHR to wait until it becomes signaled before presenting the image.

The specification of VkPresentInfoKHR says:

pWaitSemaphores, if not VK_NULL_HANDLE, is an array of VkSemaphore objects with waitSemaphoreCount entries, and specifies the semaphores to wait for before issuing the present request

Meaning: I will never present any image which hasn't finished rendering, and thus the associated CB cannot be in use any more as soon as the image is presented.

The second semaphore, presentation_finished, is signalled by vkAcquireNextImageKHR and passed to the same vkQueueSubmit (to start the rendering). This means the rendering of any image will start no earlier than allowed by the presentation engine.

To conclude: A present request from vkQueuePresentKHR is not issued before the rendering of the image is finished, and vkAcquireNextImageKHR blocks until an image is available and will never return an image that is currently acquired.

What am I missing that makes a fence necessary?

I have included a minimal code example containing only the conceptually important parts to illustrate the problem.

VkImage swapchain_images[3];
VkCommandBuffer command_buffers[3]; /* one per swapchain image */

VkSemaphore rendering_finished;
VkSemaphore presentation_finished;

void RenderLoop()
{
    /* Acquire an image from the swapchain. Block until one is available.
       Signal presentation_finished when we are allowed to render into the image */
    uint32_t index;
    vkAcquireNextImageKHR(device, swapchain, UINT64_MAX, presentation_finished, VK_NULL_HANDLE, &index);

    /* (...) Framebuffer creation, etc. */

    /* Begin CB: The command pool is flagged to reset the command buffer on reuse */
    VkCommandBuffer cb = command_buffers[index];
    vkBeginCommandBuffer(cb, ...);

    /* (...) Trivial rendering of a single color image */

    /* End CB */
    vkEndCommandBuffer(cb);


    /* Queue the rendering and wait for presentation_finished.
       When rendering is finished, signal rendering_finished.

       The VkSubmitInfo will have these important members set among others:
       .pWaitSemaphores = &presentation_finished;
       .pSignalSemaphores = &rendering_finished;
    */
    vkQueueSubmit(render_queue, 1, &submit_info, VK_NULL_HANDLE); /* no fence passed */

    /* Submit the presentation request as soon as the rendering_finished
       semaphore gets signalled

       The VkPresentInfoKHR will have these important members set among others:
       .pWaitSemaphores = &rendering_finished;
    */
    vkQueuePresentKHR(present_queue, &present_info);
}

Inserting a fence when submitting the CB to the rendering queue and waiting on it before using that CB again obviously fixes the issue, but - as explained - seems redundant.

score 23 · Accepted Answer · edited Jun 20 '20 at 09:12

vkAcquireNextImageKHR is allowed to return an image that is still the destination and/or source of ongoing asynchronous operations. This means you have no guarantee the command buffer is available at time of reuse. It would be correct to enqueue additional, distinct command buffers to write to the acquired image, as long as those commands are configured to wait on the presentation_finished semaphore; but to safely reuse that command buffer you must wait on the fence passed to vkQueueSubmit.

See section 29.6. WSI Swapchain in the Vulkan spec with KHR extensions:

An application can acquire use of a presentable image with vkAcquireNextImageKHR. After acquiring a presentable image and before modifying it, the application must use a synchronization primitive to ensure that the presentation engine has finished reading from the image. The application can then transition the image’s layout, queue rendering commands to it, etc. Finally, the application presents the image with vkQueuePresentKHR, which releases the acquisition of the image.

See also these notes for vkAcquireNextImageKHR:

When successful, vkAcquireNextImageKHR acquires a presentable image that the application can use, and sets pImageIndex to the index of that image. The presentation engine may not have finished reading from the image at the time it is acquired, so the application must use semaphore and/or fence to ensure that the image layout and contents are not modified until the presentation engine reads have completed.

[...]

As mentioned above, the presentation engine may be asynchronous with respect to the application and/or logical device. vkAcquireNextImageKHR may return as soon as it can identify which image will be acquired, and can guarantee that semaphore and fence will be signaled by the presentation engine; and may not successfully return sooner. The application uses timeout to specify how long vkAcquireNextImageKHR waits for an image to become acquired.

This shows that vkAcquireNextImageKHR is not required to block on the presentation operation, and transitively is not required to block on the graphics command that the presentation operation is itself waiting on.
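The reasoning above can be illustrated with a small host-side model in C. This is only a sketch, not Vulkan code: ModelCb and the model_* functions are hypothetical names made up for the illustration, the round-robin acquire order is an assumption, and the GPU is assumed (worst case) to finish a submission only when the host waits on its fence. The state names follow the command buffer lifecycle in the Vulkan spec (initial, recording, executable, pending).

```c
#include <assert.h>
#include <stdbool.h>

#define IMAGE_COUNT 3

/* Command buffer lifecycle states, named after those in the Vulkan spec. */
typedef enum { CB_INITIAL, CB_RECORDING, CB_EXECUTABLE, CB_PENDING } CbState;

typedef struct {
    CbState state;
    bool fence_signaled; /* host-visible completion flag for the last submit */
} ModelCb;

/* Stand-in for vkAcquireNextImageKHR: returns the next index without
   checking whether earlier rendering to that image has completed.
   Round-robin order is an assumption of this model. */
static int model_acquire(int *next) {
    int index = *next;
    *next = (*next + 1) % IMAGE_COUNT;
    return index;
}

/* Stand-in for vkBeginCommandBuffer: a pending buffer must not be re-recorded. */
static bool model_begin(ModelCb *cb) {
    if (cb->state == CB_PENDING)
        return false; /* the case the validation layer complains about */
    cb->state = CB_RECORDING;
    return true;
}

/* Stand-in for vkEndCommandBuffer followed by vkQueueSubmit with a fence. */
static void model_end_and_submit(ModelCb *cb) {
    assert(cb->state == CB_RECORDING);
    cb->state = CB_PENDING;
    cb->fence_signaled = false;
}

/* Stand-in for vkWaitForFences: blocking is modeled as letting the GPU
   finish the submission, which signals the fence and ends the pending state. */
static void model_wait_fence(ModelCb *cb) {
    if (!cb->fence_signaled) {
        cb->state = CB_EXECUTABLE;
        cb->fence_signaled = true;
    }
}

/* Runs the render loop for `frames` iterations and returns how many
   begin calls hit a command buffer that was still pending. */
static int model_run(int frames, bool wait_on_fence) {
    ModelCb cbs[IMAGE_COUNT];
    for (int i = 0; i < IMAGE_COUNT; i++) {
        cbs[i].state = CB_INITIAL;
        cbs[i].fence_signaled = true; /* like a fence created in the signaled state */
    }
    int next = 0, rejected = 0;
    for (int f = 0; f < frames; f++) {
        ModelCb *cb = &cbs[model_acquire(&next)];
        if (wait_on_fence)
            model_wait_fence(cb);
        if (!model_begin(cb)) {
            rejected++;
            continue;
        }
        model_end_and_submit(cb);
    }
    return rejected;
}
```

In this model, model_run(6, false) returns 3: every second use of a command buffer is rejected, which matches the point at which the validation error in the question appears. model_run(6, true) returns 0. In real code, the corresponding steps are passing a VkFence as the last argument of vkQueueSubmit and calling vkWaitForFences (then vkResetFences) on it before recording into that command buffer again.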
