I came across this article that states there are no differences in performance between atomic counter buffers and an atomic variable in an SSBO:
Is this actually true across NVIDIA and AMD GPUs now? I seem to remember that Radeon HD 5870 generation GPUs had specific, faster support for the atomic counter subset, so at one point the performance difference may have been an AMD-specific thing.
From what I know of NVIDIA CUDA, I suspect it has never made a difference on their hardware?
Does anyone know from which GPU generation onward, for AMD and NVIDIA, atomic counters are no longer worth using over an SSBO atomic?