It appears that Nvidia once had an extension that permitted half floating point values in OpenGL 1.1, but apparently the word `half` has since been reclaimed by the modern GLSL spec.

Today I can use 16-bit floating point values in CUDA with no problem, so there should be no hardware obstacle to NVIDIA supporting 16-bit floats. They appear to support them in HLSL, and, contradictorily, even seem to support them in HLSL cross-compiled to SPIR-V, while GLSL does not for Nvidia. SPIR-V seems to have all the primitives needed to support 16-bit floating point through its primary (KHR) extensions, so there doesn't seem to be a reason why it should forbid me from using them.

I'm unsure why, despite having an Nvidia card, I can't take advantage of 16-bit floating point arithmetic, and am apparently forced to use AMD or switch APIs entirely if I want to. Surely there must be some way to actually use true 16-bit floating point values for both?

I am NOT asking about host-to-device allocated buffers (i.e. vertex buffers). Yes, you can allocate those as 16-bit floats with a KHR extension without too much of an issue. What I'm worried about is using 16-bit floats inside the actual shader, not 16-bit floats coerced to 32-bit floats.

Krupip
  • I assume you are aware that the throughput for FP16 operations on all Pascal-family [consumer cards](https://en.wikipedia.org/wiki/GeForce_10_series) is so low that you would be *much* better off reading from an FP16 buffer but computing in FP32? – njuffa Apr 17 '18 at 22:02
  • @njuffa Wow, I didn't realize that Nvidia wasn't doing much on the consumer end of fp16 performance. They've been advertising that "fp16 will increase your performance!" in CUDA since CUDA 7.0. Those stats are odd though, considering PTX would JIT fp16 to use fp16 units *and* fp32 units. However, looking at the CUDA 8.0 mixed precision material, they specifically mention P series instructions, rather than instructions available to all GPUs. – Krupip Apr 17 '18 at 22:19
  • @njuffa In Volta, however, it seems like tensor cores will be on consumer GPUs as well, otherwise their neural network denoise filter won't be relevant at all. Tensor cores are 4x4x4 16-bit float matrix multiply units. – Krupip Apr 17 '18 at 22:20
  • The blog post you linked specifically talks about P100, the only Pascal-family GPU with high throughput for FP16 operations. P100 is not the basis of any consumer parts, it is found only in high-end Quadro and Tesla parts. At this point nobody knows what Volta consumer parts will look like, or whether there will actually be Volta consumer parts (some rumors state that consumer parts will use a different architecture). IMHO, FP16 is very useful as a *storage* format that reduces bandwidth requirements, but not very useful for *computation* across the huge universe of NVIDIA *consumer* GPUs. – njuffa Apr 17 '18 at 22:36
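
njuffa's suggestion above (read from an FP16 buffer, compute in FP32) can be illustrated on the CPU side. This is only a minimal C++ sketch and not something from the thread: `half_bits_to_float` and `sum_half_buffer` are hypothetical helper names, and the code assumes the buffer holds raw IEEE 754 binary16 bit patterns and that `float` is IEEE 754 binary32.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: widen one binary16 value (raw bits) to a 32-bit float.
// Assumes float is IEEE 754 binary32.
float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits;

    if (exponent == 0x1Fu) {
        // Infinity or NaN: all-ones exponent, mantissa bits carried over.
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {
        // Normal number: rebias the exponent from 15 to 127 (add 112).
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    } else if (mantissa != 0) {
        // Subnormal half: shift until the implicit leading 1 appears.
        std::uint32_t e = 113u;  // biased float exponent for 2^-14
        while ((mantissa & 0x400u) == 0) {
            mantissa <<= 1;
            --e;
        }
        bits = sign | (e << 23) | ((mantissa & 0x3FFu) << 13);
    } else {
        bits = sign;  // signed zero
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);  // reinterpret the bits without UB
    return f;
}

// Hypothetical helper: storage stays 16-bit, arithmetic is done in FP32.
float sum_half_buffer(const std::vector<std::uint16_t>& halves) {
    float acc = 0.0f;
    for (std::uint16_t h : halves) {
        acc += half_bits_to_float(h);
    }
    return acc;
}
```

For instance, `sum_half_buffer({0x3C00, 0x4000, 0x3800})` adds the halves 1.0, 2.0 and 0.5 in FP32. The buffer costs two bytes per element, while every intermediate result keeps full 32-bit precision.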

1 Answer

VK_KHR_shader_float16_int8 exposes FP16 capabilities (as well as 8-bit integers) within shaders, through both SPIR-V and Vulkan. This extension was promoted to core (as an optional feature) in Vulkan 1.2. This capability only enables computations within a shader, not the use of 16-bit floats in shader interfaces (vertex shader inputs, UBOs, etc.).

SPV_AMD_gpu_shader_half_float exposed the Float16 capability to SPIR-V, but the corresponding Vulkan extension VK_AMD_gpu_shader_half_float did not actually enable a similar capability in Vulkan. So you could not really use it. This was eventually fixed.

Nicol Bolas
  • Ok I'm confused about the "Of course, the issue is that Vulkan itself does not offer the Float16 capability" part, in combination with float16 storage, and the AMD extension, how is this the case? It also seems that the link has float16 support as well. – Krupip Apr 17 '18 at 23:00
  • Yes, the *compiler* glslang can output SPIR-V that has the Float16 capability. But the Vulkan specification does not offer any feature or extension that *exposes* that capability. So no Vulkan implementation can *consume* shaders that use `Float16`. Yes, it exposes features for 16-bit float *storage*, but Appendix A *never* mentions `Float16` as a feature. It has features like `StorageBuffer16BitAccess` or `StoragePushConstant16`, but not `Float16` itself. 16-bit storage only applies to getting and setting data, not to internal shader processing of data. – Nicol Bolas Apr 18 '18 at 00:21
  • but doesn't that link you put there literally show extensions for half float shaders...? – Krupip Apr 18 '18 at 01:02
  • @snb: That link is for ***the compiler***, not for your Vulkan implementation that takes the compiler's output. The compiler can spit out code that uses `Float16` all day, but if the Vulkan implementation cannot take it (and the specification doesn't allow them to), then it won't matter. The SPIR-V shader uses a capability that the Vulkan implementation does not provide, and that is a Valid Usage violation in module compilation, so you get undefined behavior. – Nicol Bolas Apr 18 '18 at 01:39
  • but the standard with extensions appears to allow this? https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VK_AMD_gpu_shader_half_float – Krupip Apr 18 '18 at 13:27
  • @snb: That Vulkan extension does one thing: it allows the SPIR-V extension to work. And if you look at that SPIR-V extension, it *only* adds a bunch of float-16 overloads of library functions. It doesn't actually turn on the `Float16` capability in the Vulkan implementation's SPIR-V. Really, just search the spec for "Float16"; you won't find it. And without that specifically listed, it won't be part of *any* Vulkan implementation's SPIR-V. – Nicol Bolas Apr 18 '18 at 13:30
  • oh weird, I got it confused with the GLSL extension. Can you add this information in your post so people don't have to dig through the comments? – Krupip Apr 18 '18 at 13:34
  • hot damn you did me one better and actually made an issue! – Krupip Apr 18 '18 at 13:46
  • This post is getting older. fp16 is available under Vulkan now. There are a number of features that need to be tested for and set. Search for "16bit" in vulkan_core.h to get a good overview of the support. The most interesting structures and members are `VkPhysicalDevice16BitStorageFeatures`, `uniformAndStorageBuffer16BitAccess` and `storagePushConstant16`. Also, the type to use if supported is `f16vec3`, etc. (hopefully I finally got this right as this is my 5th edit!) – pmw1234 Dec 13 '22 at 12:44
  • @pmw1234: The OP specifically stated that they were not asking about using 16 bit values in interface types. I've updated the post to reflect the current state of using 16-bit values for in-shader operations. – Nicol Bolas Dec 13 '22 at 14:58
  • @NicolBolas This is just one of the posts that comes up when looking into fp16 on vulkan so I just wanted to keep it up to date, that's all. – pmw1234 Dec 13 '22 at 15:02