I'm trying to make a soft body physics simulation using OpenGL compute shaders. I'm using a spring/mass model, where objects are modeled as a mesh of particles connected by springs (Wikipedia link with more details). My plan is to have a big SSBO that stores the position, velocity, and net force for each particle. I'll have a compute shader that, for each spring, calculates the force between the particles on both ends of that spring (using Hooke's law) and adds that to the net forces for those two particles. Then I'll have another compute shader that, for each particle, does some sort of Euler integration using the data from the SSBO, and then zeros the net force for the next frame.

My problem is with memory synchronization. Each particle is attached to more than one spring, so in the first compute shader, different invocations will be adding to the same location in memory (the one holding the net force). The spring calculations don't use data from that variable, so the writes can take place in whatever order, but I'm unfamiliar with how OpenGL memory works, and I'm not sure how to avoid race conditions. In addition, from what I've read it seems like I'll need glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT) between the calls to the spring and particle compute shaders, so that the data written by the former is visible to the latter. Is this necessary/sufficient?

I'm afraid to experiment, because I'm not sure what protections OpenGL gives against undefined behavior and accidentally screwing up your computer.

    "*accidentally screwing up your computer.*" The absolute most that might happen is that you'll have to reboot your computer. – Nicol Bolas Nov 02 '22 at 01:42

1 Answer

different invocations will be adding to the same location in memory (the one holding the net force). The spring calculations don't use data from that variable, so the writes can take place in whatever order

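In this case you would need to use atomicAdd() in GLSL to make sure two separate threads don't get into a race condition. Note that core GLSL atomicAdd() only accepts int and uint, so for float forces you'd need either a vendor extension such as GL_NV_shader_atomic_float, a fixed-point integer encoding of the force, or an atomicCompSwap() loop on the float's bit pattern.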

In your case I don't think this will be a performance issue, but you should be aware that atomicAdd() can cause a big slowdown in cases where many threads are hitting the same location in memory at the same time (they have to serialize and wait for each other). This performance issue is called "contention", and depending on the problem, you can usually improve it a lot by using warp-level primitives to make sure only one thread within each warp needs to actually commit the atomicAdd() (or other atomic operation).

Also, "warps" is Nvidia terminology; AMD calls them "wavefronts", and other hardware vendors and APIs use different names still.
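To make the race and its fix concrete, here is a minimal CPU-side analogue in C++ (not shader code). It is only a sketch under some assumptions: particles live on a 1D line, each std::thread stands in for one shader invocation per spring, and the names Spring, atomic_add, and accumulate_forces are made up for illustration. The compare-exchange loop is the same idea as an atomicCompSwap() loop in GLSL: retry until nobody else modified the value between your read and your write.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical 1D spring connecting particles a and b.
struct Spring {
    std::size_t a;
    std::size_t b;
    float rest_length;
    float stiffness;
};

// Atomic float add built from a compare-exchange loop.
// If another thread changed `target` in the meantime, the exchange fails,
// `expected` is refreshed with the current value, and we retry.
inline void atomic_add(std::atomic<float>& target, float value) {
    float expected = target.load(std::memory_order_relaxed);
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_relaxed)) {
    }
}

// One thread per spring; each adds Hooke's-law forces to both endpoints.
std::vector<float> accumulate_forces(const std::vector<float>& position,
                                     const std::vector<Spring>& springs) {
    std::vector<std::atomic<float>> force(position.size());
    for (auto& f : force) f.store(0.0f, std::memory_order_relaxed);

    std::vector<std::thread> workers;
    workers.reserve(springs.size());
    for (const Spring& s : springs) {
        assert(s.a < position.size() && s.b < position.size());
        workers.emplace_back([&force, &position, s] {
            float delta = position[s.b] - position[s.a];
            float length = delta < 0.0f ? -delta : delta;
            float direction = delta < 0.0f ? -1.0f : 1.0f;
            // Positive magnitude means the spring is stretched and pulls inward.
            float magnitude = s.stiffness * (length - s.rest_length);
            atomic_add(force[s.a], direction * magnitude);
            atomic_add(force[s.b], -direction * magnitude);
        });
    }
    // Joining makes every thread's writes visible before anything reads them.
    for (auto& w : workers) w.join();

    std::vector<float> result(force.size());
    for (std::size_t i = 0; i < force.size(); ++i) {
        result[i] = force[i].load(std::memory_order_relaxed);
    }
    return result;
}
```

Replacing atomic_add with a plain `+=` on a non-atomic float would let two springs that share a particle overwrite each other's contribution, which is exactly the race described in the question.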

In addition, from what I've read it seems like I'll need glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT)

This is correct. Conceptually, the way I think about it is that OpenGL compute shaders are async by default: when you launch a compute shader, there's no guarantee that its writes are finished and visible by the time subsequent commands run. glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT) basically acts as a wait between draw/compute commands that access that type of resource. Since your particle shader also writes to the SSBO (zeroing the forces), you'll want the same barrier after it as well, before the next frame's spring pass.

Ubler