In DX12 what Ordering Guarantees do multiple ExecuteCommandLists calls provide?

Question

Assume a single-threaded application. If you call ExecuteCommandLists twice (A and B), is A guaranteed to execute all of its commands on the GPU before B starts any of its commands? The closest thing I can find in the documentation is this, but it doesn't really guarantee that A finishes before B starts:

Applications can submit command lists to any command queue from multiple threads. The runtime will perform the work of serializing these requests in the order of submission.

As a point of comparison, I know that this is explicitly not guaranteed in Vulkan:

vkQueueSubmit is a queue submission command, with each batch defined by an element of pSubmits as an instance of the VkSubmitInfo structure. Batches begin execution in the order they appear in pSubmits, but may complete out of order.

However, I'm not sure if DX12 works the same way.

Frank Luna's book says:

The command lists are executed in order starting with the first array element

However, in that context he's talking about calling ExecuteCommandLists once with two command lists (C and D). Does this behave the same as two individual calls? My colleague argues that this still only guarantees that they are started in order, not that C finishes before D starts.
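For reference, the two submission patterns being compared might look like the following. This is only a sketch: `SubmitSeparately` and `SubmitTogether` are hypothetical helper names, and `Queue` and `List` are template parameters purely so the snippet compiles without the Windows SDK. In a real application they would be `ID3D12CommandQueue` and `ID3D12CommandList`.

```cpp
#include <cassert>

// Pattern 1: two calls, one command list each (A, then B).
// Queue/List stand in for ID3D12CommandQueue/ID3D12CommandList (assumption).
template <typename Queue, typename List>
void SubmitSeparately(Queue* queue, List* a, List* b)
{
    assert(queue != nullptr && a != nullptr && b != nullptr);
    List* first[]  = { a };
    List* second[] = { b };
    queue->ExecuteCommandLists(1, first);   // workload A
    queue->ExecuteCommandLists(1, second);  // workload B
}

// Pattern 2: one call with both command lists (C, then D) in a single array.
template <typename Queue, typename List>
void SubmitTogether(Queue* queue, List* c, List* d)
{
    assert(queue != nullptr && c != nullptr && d != nullptr);
    List* both[] = { c, d };
    queue->ExecuteCommandLists(2, both);    // C and D in one submission
}
```

Both patterns hand the same command lists to the queue in the same order. The question is whether they also give the same completion guarantees.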

Is there clearer documentation somewhere that I'm missing?

Lockyer · Accepted Answer · 2018-09-19T13:02:44.167

I asked the same question in the DirectX forums. Here's an answer from Microsoft engineer Jesse Natalie:

Calling ExecuteCommandLists twice guarantees that the first workload (A) finishes before the second workload (B). Calling ExecuteCommandLists with two command lists allows the driver to merge the two command lists such that the second command list (D) may begin executing work before all work from the first (C) has finished.

Specifically, the application is allowed to insert a fence signal or wait between A and B, and the driver has no visibility into this, so the driver must ensure that everything in A is complete before the fence operation. There is no such opportunity in a single call to the API, so the driver can optimize that scenario.

Source: http://forums.directxtech.com/index.php?topic=5975.0
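The fence operations mentioned in the quoted answer can be sketched roughly as follows. `SubmitAThenB` is a hypothetical helper, and `Queue`, `Fence` and `List` are template parameters only so that the snippet compiles without the Windows SDK; in a real application they would be `ID3D12CommandQueue`, `ID3D12Fence` and `ID3D12CommandList`, whose `ExecuteCommandLists`, `Signal` and `Wait` methods are the calls used here. Going by the answer above, two separate calls are already ordered, so the explicit `Signal`/`Wait` pair on the same queue is shown only to make the dependency visible; a `Wait` is more commonly issued on a different queue.

```cpp
#include <cassert>
#include <cstdint>

// Submits A, places a fence between the two submissions, then submits B.
// Returns an HRESULT-style code (negative means failure).
template <typename Queue, typename Fence, typename List>
long SubmitAThenB(Queue* queue, Fence* fence, std::uint64_t fenceValue,
                  List* const* listsA, unsigned countA,
                  List* const* listsB, unsigned countB)
{
    assert(queue != nullptr && fence != nullptr);

    // First submission (workload A).
    queue->ExecuteCommandLists(countA, listsA);

    // GPU-side signal: the fence is set to fenceValue once the queue has
    // finished the work submitted before this call.
    long hr = queue->Signal(fence, fenceValue);
    if (hr < 0) return hr;  // same check as the FAILED() macro

    // GPU-side wait: the queue does not proceed past this point until the
    // fence has reached fenceValue. The CPU thread is not blocked.
    hr = queue->Wait(fence, fenceValue);
    if (hr < 0) return hr;

    // Second submission (workload B).
    queue->ExecuteCommandLists(countB, listsB);
    return 0;  // S_OK
}
```

Each submission should use a `fenceValue` larger than the previous one, since a fence only reports the most recent value it has been set to.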

Tom Huntington · Answer 2 · 2022-09-21T06:11:25.763

Finally the ID3D12CommandQueue is a first-in first-out queue, that stores the correct order of the command lists for submission to the GPU. Only when one command list has completed execution on the GPU, will the next command list from the queue be submitted by the driver.

https://learn.microsoft.com/en-us/windows/win32/direct3d12/porting-from-direct3d-11-to-direct3d-12

This isn't correct. I believe DirectX 12 is the same as Vulkan:

The specification states that commands start execution in-order, but complete out-of-order. Don’t get confused by this. The fact that commands start in-order is simply convenient language to make the spec language easier to write. Unless you add synchronization yourself, all commands in a queue execute out of order

I've just run into this again. Command list A is not guaranteed to complete before command list B starts, and this creates race conditions:

A writes    A reads
            
────────────────────
                  
      B writes     B reads

Edit: It turns out I was doing something stupid (calling CopyTextureRegion on two buffers), and this was causing a stall (which I could see in PIX), so the work for my next frame was sometimes started during this stall, resulting in a race condition. Usually the commands for one frame complete before the next frame's commands start; if they don't, you will see a gap in PIX where no work is happening in the currently viewed frame's timings.
