I'm interested in implementing a particular algorithm in a set of Vulkan compute shaders. The algorithm uses a clz() function at one point. I expect that my NVIDIA GPU probably offers hardware support for this function; CUDA uses a clz instruction apparently, and clz() is in OpenCL 1.2 as well. So I don't want to write my own clz(). Is there any way for me to call the function in the way CUDA or OpenCL would do?

I suppose I could try compiling an OpenCL kernel to SPIR-V and using that in Vulkan, but I don't suppose Vulkan would be very happy about that...?

Another thought I've had is that maybe I could translate a very simple OpenCL kernel containing a clz() call to SPIR-V assembly, do the same with my GLSL shader, and then manually hack the clz() call, as it appears in the kernel assembly code, into the shader's assembly code. But I don't really know anything about the details of SPIR-V, or about any limits Vulkan may place on what sorts of SPIR-V instructions a compute shader may use, so I have hardly any idea about whether that could actually work.

Nicol Bolas
mjwach

1 Answer

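Vulkan-bound SPIR-V has access to the GLSL.std.450 extended instruction set, which includes FindUMSB, an instruction that returns the bit index of the most significant set bit of an unsigned integer. You can use that to emulate a 32-bit clz by computing 31 - FindUMSB(x). FindUMSB returns -1 when x is 0, so the expression yields 32 in that case, which is what you would expect from clz when all 32 bits are leading zeros. It's possible, if the hardware has an explicit clz instruction, that the compiler will recognize the subtraction and replace the whole expression with the native clz.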
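As a rough illustration, here is a minimal sketch of a compute shader that does this through GLSL's findMSB, which is the source-level counterpart of FindUMSB for unsigned arguments. The helper name clz32, the buffer name Data, the binding index, and the workgroup size are all assumptions made up for the example, not anything Vulkan requires.

```glsl
#version 450

// Hypothetical workgroup size; pick whatever suits your dispatch.
layout(local_size_x = 64) in;

// Hypothetical storage buffer: each element is replaced by its clz.
layout(std430, binding = 0) buffer Data {
    uint values[];
};

// Count leading zeros of a 32-bit unsigned value.
// findMSB(uint) returns the index of the highest set bit, or -1 if x == 0,
// so this returns 32 for x == 0 and 0 when bit 31 is set.
uint clz32(uint x) {
    return uint(31 - findMSB(x));
}

void main() {
    uint i = gl_GlobalInvocationID.x;

    // Skip invocations past the end of the buffer.
    if (i >= uint(values.length())) {
        return;
    }

    values[i] = clz32(values[i]);
}
```

The argument is kept as uint on purpose: with a signed int, findMSB treats negative values differently (it looks for the most significant 0 bit), so it would no longer line up with clz. Whether the driver turns this into a single hardware instruction is up to its compiler.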

Nicol Bolas
  • Note that a similar function in HLSL, _[firstbithigh](https://msdn.microsoft.com/en-us/library/windows/desktop/ff471400(v=vs.85).aspx)_, is a core function in dx11. I suppose the compiler can be opaque about it and just emulate it if the device doesn't support it. You can likely do the same. – Quinchilion Aug 20 '16 at 10:26
  • @Quinchilion: Yeah, it turns out that GLSL has that function too. – Nicol Bolas Aug 20 '16 at 13:21
  • Specifically, I'm seeing a [findMSB](https://www.opengl.org/sdk/docs/man/html/findMSB.xhtml) function in GLSL, which I assume transforms (during compilation from GLSL to SPIR-V) into either FindUMSB or FindSMSB according to the type of the argument. In my case, that's close enough to clz to be satisfying. Thanks! – mjwach Aug 20 '16 at 15:28