Render target write / shader read synchronization between different render passes

Question

Here is a Vulkan rendering setup:

Submit CB1 - contains render pass A with X draw calls.
Submit CB2 - contains render pass B with Y draw calls, where one of the draw calls samples from an image that is a render target of render pass A.
During submission of CB2, a semaphore shared with an external GPU API is signaled to make sure CB2 has finished executing before the render target of render pass B is used further (by CUDA in this case). This step is set up correctly and it is clear to me how it works.

All this happens on the same queue, in the order specified above. Render passes A and B share an MSAA image, but each has its own color attachment into which the MSAA image is resolved.

My question is: what is the best way to synchronize between CB1 finishing its writes to the render pass A RT and one or more draw calls in CB2 sampling from that RT in a shader during render pass B?

Based on a suggestion I received in the Khronos Vulkan Slack group, I tried synchronizing via subpass dependencies.

Render pass A dependency setup:

    VkSubpassDependency dependencies[2] = {};

    dependencies[0].srcSubpass = VK_SUBPASS_EXTERNAL;
    dependencies[0].dstSubpass = 0;
    dependencies[0].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[0].dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT; 
    dependencies[0].srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    dependencies[0].dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
    dependencies[0].dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;
    VkAttachmentReference colorReferences[2] = {};
    colorReferences[0] = { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL }; //this one for MSAA attachment
    colorReferences[1] = { 1, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL };

Render pass B dependency setup:

    VkSubpassDependency dependencies[2] = {};

    dependencies[0].srcSubpass = VK_SUBPASS_EXTERNAL;
    dependencies[0].dstSubpass = 0;
    dependencies[0].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[0].dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[0].srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
    dependencies[0].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    dependencies[0].dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;
    VkAttachmentReference colorReferences[2] = {};
    colorReferences[0] = { 0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL }; 
    colorReferences[1] = { 1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL };

The above solution seems to work, but I am not sure it gives me an implicit synchronization guarantee. In this discussion, one of the answers states:

The only images where that isn’t the case is dependencies for render targets. And those are treated specially by subpasses anyway. Each subpass says what it is writing and to where, so the system has the information to build those memory dependencies explicitly.

Also in this article the author writes:

NOTE: Frame buffer operations inside a render pass happen in API-order, of course. This is a special exception which the spec calls out.

But in this SO question, the answer says that synchronization between command buffers is required, using events in that case.

So another option I tried was to insert a pipeline barrier with an image memory barrier while recording CB2, for images that are render targets in a previously submitted CB (render pass A), before beginning render pass B:

CB2 recording:

    BeginCommandBuffer(commandBuffer);
    ...
    if (vulkanTex->isRenderTarget) // for each sampler in this pass
    {
        VkImageMemoryBarrier imageMemoryBarrier = CreateImageBarrier();
        imageMemoryBarrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
        imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
        imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
        imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
        imageMemoryBarrier.image = vulkanTex->image;
        VkImageSubresourceRange range = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
        imageMemoryBarrier.subresourceRange = range;

        vkCmdPipelineBarrier(
            commandBuffer,
            VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
            VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
            0,
            0, nullptr,
            0, nullptr,
            1, &imageMemoryBarrier);
    }

    VkRenderPassBeginInfo renderPassBeginInfo = {};
    ...
    .....

Of course, in this scenario I can make the subpass dependency the same for all render passes, like the render pass B dependency setup. This also works, but it requires recording more commands into the CBs. So, which way is correct (given that the subpass dependency option is valid) and optimal in terms of hardware efficiency?

score 2 · Accepted Answer · answered Apr 07 '20 at 14:42

The above solution seems to be working.

Your dependencies make no sense. The renderpass B dependency in particular is decidedly weird, given your description of the actual dependency: "where one of the draw calls samples from image which is render target of render pass A." That would represent a dependency between the framebuffer writes within A and the texture reads in B. Your dependency in your example creates a dependency between framebuffer writes within A and framebuffer writes within B. Which makes no sense and is unrelated to what you say you're actually doing.

Also, your srcAccessMask makes no sense, as you state that the prior source is reading memory, when you are trying to synchronize with something that is writing memory.

Now, it may be that your semaphore dependency happens to be covering it, or that the semaphore is interfering with Vulkan layers' attempts to detect dependency problems (you are using layers, right?). But the code you've shown simply doesn't make sense.

An external dependency on renderpass B is the right way to go (it's not clear why renderpass A needs an external dependency here), but it needs to actually make sense. If renderpass B is indeed sampling from an image written to by renderpass A, then it would look something like this:

    dependencies[0].srcSubpass = VK_SUBPASS_EXTERNAL;
    dependencies[0].dstSubpass = 0;
    dependencies[0].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dependencies[0].dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT; //Assuming you're reading the image in the fragment shader. Insert other shader stage(s) if otherwise.
    dependencies[0].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT; //renderpass A wrote to the image as an attachment.
    dependencies[0].dstAccessMask = VK_ACCESS_SHADER_READ_BIT; //renderpass B is reading from the image in a shader.
    dependencies[0].dependencyFlags = 0; //By region dependencies make no sense here, since they're not part of the same renderpass.

Yes, I am using validation layers and they tell me nothing. So just to make sure I understood you: the above setup is for render pass B, right? Then I don't really understand what dependencies mean, because I was thinking that, for example, dstAccessMask means how the results of THIS render pass will be accessed somewhere else. — Michael IV, Apr 07 '20 at 14:56
@MichaelIV: You're writing a dependency. That the dependency happens to be a subpass external dependency doesn't change what the values in a dependency *mean*; it merely changes where the dependency sits. A source external dependency sits between the world before the render pass and the destination subpass. But the *meaning* of the various flags doesn't change. The `srcAccessMask` specifies how the commands in the source scope, whatever that may be, have accessed the given memory. — Nicol Bolas, Apr 07 '20 at 15:02
VK_ACCESS_MEMORY_READ_BIT I inherited from a render pass demo which rendered to a swap chain. In multi-pass offscreen rendering it indeed doesn't make sense. — Michael IV, Apr 07 '20 at 15:06
