Should I ever use a `vec3` inside of a uniform buffer or shader storage buffer object?

Question

The vec3 type is a very nice type. It only takes up 3 floats, and I have data that only needs 3 floats. And I want to use one in a structure in a UBO and/or SSBO:

layout(std140) uniform UBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

layout(std430) buffer SSBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

Then, in my C or C++ code, I can do this to create matching data structures:

struct UBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

struct SSBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

Is this a good idea?

Nicol Bolas · Accepted Answer · 2019-12-01T20:49:59.070

NO! Never do this!

When declaring UBOs/SSBOs, pretend that all 3-element vector types don't exist. This includes column-major matrices with 3 rows and row-major matrices with 3 columns. Pretend that the only types are scalars, 2-element vectors, and 4-element vectors (and matrices built from them). You will save yourself a great deal of grief if you do so.

If you want the effect of a vec3 + a float, then you should pack it manually:

layout(std140) uniform UBO
{
  vec4 data1;
  vec4 data2and3;
};

Yes, you'll have to use data2and3.w to get the other value. Deal with it.
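As a rough sketch of what the matching C++ side could look like (the names `Vec4` and `UBO_cpu` are hypothetical, and this assumes a 4-byte `float` and a compiler that honors 16-byte alignment):

```cpp
#include <cstddef>

// Hypothetical CPU-side 4-float vector; not a type from any particular library.
struct alignas(16) Vec4 { float x, y, z, w; };

// Mirrors: layout(std140) uniform UBO { vec4 data1; vec4 data2and3; };
struct UBO_cpu
{
    Vec4 data1;      // std140 offset 0
    Vec4 data2and3;  // std140 offset 16; x, y, z hold data2 and w holds data3
};

// Compile-time checks that the C++ layout agrees with the std140 offsets.
static_assert(sizeof(Vec4) == 16, "Vec4 must be exactly 16 bytes");
static_assert(offsetof(UBO_cpu, data1) == 0, "data1 starts the block");
static_assert(offsetof(UBO_cpu, data2and3) == 16, "data2and3 follows at 16");
static_assert(sizeof(UBO_cpu) == 32, "two vec4s occupy 32 bytes");
```

Because every member is a 16-byte, 16-byte-aligned type, the C++ struct can be copied into the buffer as-is.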

If you want arrays of vec3s, then make them arrays of vec4s. Same goes for matrices that use 3-element vectors. Just banish the entire concept of 3-element vectors from your SSBOs/UBOs; you'll be much better off in the long run.
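For arrays, a minimal sketch under the same assumptions (4-byte `float`; `Vec4`, `Lights_cpu`, and the block it mirrors are hypothetical names used only for illustration):

```cpp
#include <cstddef>

// Hypothetical CPU-side 4-float vector.
struct alignas(16) Vec4 { float x, y, z, w; };

// Mirrors a hypothetical block:
//   layout(std140) uniform Lights { vec4 positions[4]; vec4 colors[4]; };
// Each array element is 16 bytes in both std140 and C++, so the strides agree.
struct Lights_cpu
{
    Vec4 positions[4];  // std140 offset 0, array stride 16
    Vec4 colors[4];     // std140 offset 64, array stride 16
};

static_assert(sizeof(Vec4) == 16, "element stride must be 16 bytes");
static_assert(offsetof(Lights_cpu, colors) == 64, "second array starts at 64");
static_assert(sizeof(Lights_cpu) == 128, "eight vec4s occupy 128 bytes");
```

The fourth component of each element is simply unused (or available for an extra scalar).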

There are two reasons why you should avoid vec3:

It won't do what C/C++ does

If you use std140 layout, then you will probably want to define data structures in C or C++ that match the definition in GLSL. That makes it easy to mix and match between the two, and std140 layout makes this at least possible in most cases. But its layout rules don't match the usual layout rules of C and C++ compilers when it comes to vec3s.

Consider the following C++ definitions for a vec3 type:

struct vec3a { float a[3]; };
struct vec3f { float x, y, z; };

Both of these are perfectly legitimate types. Their size and internal layout match what std140 requires for a vec3. What they do not match is the alignment behavior that std140 imposes.

Consider this:

//GLSL
layout(std140) uniform Block
{
    vec3 a;
    vec3 b;
} block;

//C++
struct Block_a
{
    vec3a a;
    vec3a b;
};

struct Block_f
{
    vec3f a;
    vec3f b;
};

On most C++ compilers, sizeof for both Block_a and Block_f will be 24. Which means that the offsetof b will be 12.

In std140 layout however, vec3 is always aligned to 4 words. And therefore, Block.b will have an offset of 16.
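The mismatch can be checked at compile time. This is a sketch that assumes a typical compiler with a 4-byte `float` and no padding inside `vec3f`; the constant `std140_offset_b` is a name introduced here for illustration:

```cpp
#include <cstddef>

struct vec3f { float x, y, z; };

struct Block_f
{
    vec3f a;
    vec3f b;
};

// Offset of Block.b under std140, where a vec3 has a 16-byte base alignment.
constexpr std::size_t std140_offset_b = 16;

// On typical compilers the C++ struct is tightly packed instead.
static_assert(sizeof(Block_f) == 24, "two 12-byte members, no padding");
static_assert(offsetof(Block_f, b) == 12, "b directly follows a");
static_assert(offsetof(Block_f, b) != std140_offset_b,
              "the C++ offset disagrees with std140");
```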

Now, you could try to fix that by using C++11's alignas functionality (or C11's similar _Alignas feature):

struct alignas(16) vec3a_16 { float a[3]; };
struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_a
{
    vec3a_16 a;
    vec3a_16 b;
};

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

If the compiler supports 16-byte alignment, this will work. Or at least, it will work in the case of Block_a and Block_f.
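A quick compile-time check of that claim, assuming a 4-byte `float` and support for 16-byte alignment:

```cpp
#include <cstddef>

struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

// alignas(16) pads the type itself out to 16 bytes...
static_assert(sizeof(vec3f_16) == 16, "size is rounded up to the alignment");
// ...so b now starts where std140 puts Block.b.
static_assert(offsetof(Block_f, b) == 16, "b matches the std140 offset");
static_assert(sizeof(Block_f) == 32, "two padded members");
```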

But it won't work in this case:

//GLSL
layout(std140) uniform Block2
{
    vec3 a;
    float b;
} block2;

//C++
struct Block2_a
{
    vec3a_16 a;
    float b;
};

struct Block2_f
{
    vec3f_16 a;
    float b;
};

By the rules of std140, each vec3 must start on a 16-byte boundary. But vec3 does not consume 16 bytes of storage; it only consumes 12. And since float can start on a 4-byte boundary, a vec3 followed by a float will take up 16 bytes.

But the rules of C++ alignment don't allow such a thing. If a type is aligned to an X byte boundary, then using that type will consume a multiple of X bytes.

So matching std140's layout requires that you pick a type based on exactly where it is used. If it's followed by a float, you have to use vec3a; if it's followed by some type that is more than 4 byte aligned, you have to use vec3a_16.
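To make the position dependence concrete, here is a sketch assuming a 4-byte `float`. `Block2_packed` is a hypothetical name, and putting `alignas(16)` on the enclosing struct is just one way to keep the block itself 16-byte aligned:

```cpp
#include <cstddef>

struct vec3a { float a[3]; };                 // 12 bytes, 4-byte aligned
struct alignas(16) vec3a_16 { float a[3]; };  // padded out to 16 bytes

// Type-level alignment: b is pushed past the padding of a.
struct Block2_a
{
    vec3a_16 a;
    float b;
};

// Plain vec3a inside a 16-byte-aligned struct: b fills the fourth float slot,
// which is where std140 places it.
struct alignas(16) Block2_packed
{
    vec3a a;
    float b;
};

static_assert(offsetof(Block2_a, b) == 16, "std140 expects 12 here");
static_assert(sizeof(Block2_a) == 32, "twice the size std140 needs");
static_assert(offsetof(Block2_packed, b) == 12, "matches std140");
static_assert(sizeof(Block2_packed) == 16, "vec3 + float share 16 bytes");
```

Neither vector type is right for every block: `vec3a_16` works when the next member is another vec3, and `vec3a` works when the next member is a float.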

Or you can just not use vec3s in your shaders and avoid all this added complexity.

Note that an alignas(8)-based vec2 will not have this problem. Nor will C/C++ structs&arrays using the proper alignment specifier (though arrays of smaller types have their own issues). This problem only occurs when using a naked vec3.
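For comparison, a sketch of the vec2 case, assuming a 4-byte `float`. The names `vec2_8` and `Block3`, and the GLSL block in the comment, are hypothetical:

```cpp
#include <cstddef>

// A vec2 is 8 bytes with an 8-byte base alignment, so size and alignment agree.
struct alignas(8) vec2_8 { float x, y; };

// Mirrors a hypothetical block:
//   layout(std140) uniform Block3 { vec2 a; float b; vec2 c; };
struct Block3
{
    vec2_8 a;  // std140 offset 0
    float b;   // std140 offset 8
    vec2_8 c;  // std140 offset 16 (rounded up from 12 to a multiple of 8)
};

static_assert(sizeof(vec2_8) == 8, "no padding is added to the type");
static_assert(offsetof(Block3, b) == 8, "b matches std140");
static_assert(offsetof(Block3, c) == 16, "c matches std140");
```

Since the type's size equals its alignment, the single type definition works wherever it appears among non-array members.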

Implementation support is fuzzy

Even if you do everything right, implementations have been known to implement vec3's oddball layout rules incorrectly. Some implementations effectively impose C++ alignment rules on GLSL: if you use a vec3, they treat it the way C++ would treat a 16-byte aligned type. On these implementations, a vec3 followed by a float will be laid out like a vec4 followed by a float.

Yes, it's the implementers' fault. But since you can't fix the implementation, you have to work around it. And the most reasonable way to do that is to just avoid vec3 altogether.

Note that, for Vulkan (and OpenGL using SPIR-V), the SDK's GLSL compiler gets this right, so implementation bugs of this kind are not a concern there.

edited Dec 01 '19 at 20:49

answered Jul 03 '16 at 17:46

Nicol Bolas


You can do your layout manually in GLSL though (and it will be laid out explicitly in the SPIR-V that the GLSL gets compiled to) – ratchet freak Jul 03 '16 at 21:07
@ratchetfreak: Yes, with `offset` and `align`, you can do your own layout. What you *cannot* do is violate the basic layout rules; the GLSL specification is semi-clear on that. `offset` cannot place a variable "within the previous member of the block". Now, what makes this "semi-clear" is what "within" means. Does a member take up its base alignment's worth of space, or does it only take up just its own space? With Vulkan, things are equally unclear, as the SPIR-V layout rules say that objects can't overlap (kinda), but again nothing is said about base alignment. – Nicol Bolas Jul 03 '16 at 23:39
Well, I recently reverse-engineered it and at least `glslangValidator` packs that tightly in SPIR-V as expected (member offsets 0, 16, 28 respectively). If it's only about alignment, then better example would be vec3 following scalar float. – krOoze Jul 04 '16 at 16:16
@krOoze: "*at least glslangValidator packs that tightly in SPIR-V as expected (member offsets 0, 16, 28 respectively).*" Well that's a bug. KHR_vulkan_glsl does not change the std140 layout rules. So unless you explicitly declare offsets, it *must* do what the OpenGL standard says. – Nicol Bolas Jul 04 '16 at 16:31
@NicolBolas Well, there is ambiguity (which I think you would know, since I believe you started the GitHub issue about it). It's clearer in the OGL spec than in the VK one (since it at least defines most of the terms used). I would lean toward "basic machine unit consumed by the previous member" there still meaning 3N bytes for `vec3` ("alignment" should not change that - that's not what the word means). – krOoze Jul 04 '16 at 16:42
^ Actually notbug'ed here I think: https://github.com/KhronosGroup/glslang/issues/201 – krOoze Jul 04 '16 at 17:15
@krOoze: I've looked over all of this and discovered that you are right. So I've changed the reasoning for avoiding them. – Nicol Bolas Jul 10 '16 at 14:49
@NicolBolas, any chance we can get an extension to this post, related to `std430`? – AzP Dec 01 '17 at 11:08
@AzP: The *only* thing `std430` changes about the layout rules is the base alignment for arrays and structs of scalars and two-element vectors. It changes nothing about `vec3`s, since the base alignment of them is always that of a `vec4`. – Nicol Bolas Dec 01 '17 at 14:38
Important thing I learned reading this: The spec allows a `float` to immediately follow a `vec3` in memory, so that they both take up a total of 16 bytes. This appears to be the main (if not only) distinguishing feature of a `vec3` vs a `vec4`, in terms of alignment. I.e. size != base alignment only in case of vectors with three components that also aren't array elements. – Philip Guin May 04 '18 at 00:10
Doesn't vec3 + float work in this circumstance in std140, [example](https://stackoverflow.com/a/19957102/2036035)? Alignment might be 16, but the float is still offset 12 bytes right? – Krupip Feb 18 '19 at 18:47
@opa: But it won't work for `vec3 + vec3`. My point is that no *single type definition* will result in *any* GLSL struct having the same layout as the C++ equivalent. The goal is to not have to remember the rules, just to copy the GLSL struct and to change the member types to C++ named types. – Nicol Bolas Feb 18 '19 at 19:27
Okay, I confused your two `vec3` examples and `vec3 + float` example my bad. – Krupip Feb 18 '19 at 19:30
@NicolBolas You mentioned, that std430 "changes nothing about vec3s, since the base alignment of them is always that of a vec4". In the chosen answer of another question (https://stackoverflow.com/questions/29531237/memory-allocation-with-std430-qualifier) I read, that in case of std430 "three-component vectors are not rounded up to the size of four-component vectors". I am confused. Are any of the above statements wrong? What am I understanding wrong? – Fox1942 May 15 '20 at 09:57
@Tom: All of the examples in that answer are correct, but the text quoted is not from the OpenGL specification; it's from a book. And that quote is incorrect. The specification states that, for std430, the rounding rules for arrays and structs don't apply. But the vec3 issue is *not* from rounding rules; its alignment is not based on rounding for arrays and structs. The base alignment of a vec3 is directly specified to be 16 bytes. – Nicol Bolas May 15 '20 at 13:28
Thank you for the informative post here. There is so little info on Vulkan that it's good to see posts with dos and don'ts. – Kalen Mar 13 '22 at 21:18
@NicolBolas Thank you for the answer. I just wanted to check: is it still valid today? – Summit Aug 04 '22 at 11:23
@Summit: Which part? The driver bug issues are probably not relevant, but the behavior hasn't changed. – Nicol Bolas Aug 04 '22 at 13:20
@Nicol Bolas Can we use vec3 in our uniform buffer objects ? – Summit Aug 04 '22 at 13:39
@Summit: This answer gives reasons why you *shouldn't* use them, not why you *can't*. If you're very careful with your object layouts and are always mindful of the rules, both in GLSL and in C, then you can get away with it. But it's not worthwhile. – Nicol Bolas Aug 04 '22 at 13:47
**and OpenGL using SPIR-V** Can you confirm this? I use shaderc to generate SPIR-V binary for my shader, but still I encounter this issue. – xubury Sep 07 '22 at 09:08
@xubury: Which issue precisely do you encounter? The issue that GL+SPIR-V fixes is the problem with variance in shader compiler implementations. That is, some compilers got layouts wrong at various points, while the SPIR-V version lays things out explicitly. Implementing `vec3` correctly by the standard *still* leads to the primary reason I suggested avoiding them (difficulty making layouts compatible with C). – Nicol Bolas Sep 07 '22 at 13:22
@NicolBolas Ok, I thought what you meant is that `vec3`'s 16 bytes alignment will behave like C/C++ when using spir-v shader in opengl. So the only way I can use `vec3` safely is to switch to vulkan? – xubury Sep 07 '22 at 14:09
@xubury: Unextended Vulkan uses the same `std140/430` layout rules as OpenGL, so that doesn't fix that problem. – Nicol Bolas Sep 07 '22 at 14:30
I can't confirm that in C++. Here is your example of vec3 followed by a float: https://godbolt.org/z/oEsP6x449. The resulting type occupies 16 bytes in C++. – facetus Feb 28 '23 at 23:17
@facetus: In your example, `vec3` is not aligned to 16 bytes; it is only the member `A::v` that is aligned to 16. Types and member variables aren't the same thing. If you align the actual type, [you get behavior as I described](https://godbolt.org/z/Mq79P4Gcr). – Nicol Bolas Feb 28 '23 at 23:25
Got it. So, with aligning struct members, we can achieve the same layout as std140. I don't see why one shouldn't use vec3 then. `alignas` should be used with C++ uniform equivalents regardless of vec3. – facetus Feb 28 '23 at 23:40
@facetus: "*I don't see why one shouldn't use vec3 then.*" Because you might *forget*. The thing about using them on types is that specifying it on the type counts for *all* uses of that type. If you use them on variables, you can forget to use them on *all* variables. You can use `vec3`s if you follow a bunch of esoteric rules on every use of those types in data structures meant for UBO/SSBOs, and if you miss one your code doesn't work for some strange reason. Or... you could just not use `vec3`s and not worry about it. I prefer code that is hard to break. – Nicol Bolas Mar 01 '23 at 01:21
Even if you avoid vec3 and use alignment on types, your code will break, if you are forgetful because types change alignment in std140 depending on whether they are in an array or not. For example, vec2 is 8 when alone, and 16 when in an array. If you define alignment on types, suddenly, it's much easier now to forget something because to follow you now have to check all ten types you used in a uniform spread across 8 header files. – facetus Mar 01 '23 at 16:16
@facetus: "*Even if you avoid vec3 and use alignment on types, your code will break, if you are forgetful because types change alignment in std140 depending on whether they are in an array or not.*" But there is no C++ mechanism to fix that. If you want to have an array, you have to make sure that the type you use is inherently 16-byte aligned. There's nothing you can do about that, so you have to remember the rules. By contrast, there's something you can do about `vec3`'s problems: do not use it. There's nothing you can do with a `vec3` that you can't do with a `vec4`. – Nicol Bolas Mar 01 '23 at 17:43
Yes, exactly, regardless of whether you use vec3 or not. I don't see a problem with vec3. – facetus Mar 02 '23 at 00:36
@facetus: Just look at how many duplicates of this question are linked here. Each one represents one person who got this wrong. Now find all the questions of people misusing arrays. There will be far fewer. Not only that, as I said, ditching `vec3` does not mean losing any expressive power. Whereas ditching arrays loses expressive power. That is, the complexity gains you something in the array case, whereas it gains you nothing in the `vec3` case. But if you feel that making your code more error-prone for zero gain is a worthwhile tradeoff, I can't stop you. – Nicol Bolas Mar 02 '23 at 01:08
Yes, this is unfortunate, people refer to this question without any critical thinking. I am lucky, I am capable of making my own decisions, regardless of what someone on stackoverflow thinks. – facetus Mar 02 '23 at 20:27
