I recently learned about Uniform Buffer Objects in OpenGL, but using them relies heavily on understanding how data is padded in order to get the proper offsets of elements. I have read some examples and parts of the std140 layout specification, but I could not understand the case of structs and arrays. Since I am not sure exactly what I am misunderstanding, I will walk through an example below in hopes that someone can point out where I go wrong. For now the emphasis is less on space optimization and more on correctness.
Say we want a struct representing some global scene data. The C++ declaration (which corresponds to the GLSL declaration) is as follows:
struct GlobalSceneData
{
glm::mat4 View; // Size: 64, Alignment = 4N
glm::mat4 Projection; // Size: 64
glm::mat3 NormalMatrix; // Size: 48
// -- Offset = 176
// bool Enabled(offset=176)
// bool Shadows(offset=180)
// PAD(4)
// PAD(4)
// vec3 Direction (offset=192, 12*4N)
// float Intensity (offset=204)
// vec3 Color (offset=208, 13*4N)
DirectionalLight Sunlight; // Size: 36 -> 48 (padded)
// -- Offset = 224
// bool Enabled(offset=224)
// bool Shadows(offset=228)
// PAD (4)
// PAD(4)
// vec3 Color (offset=240, 15*4N)
// float Intensity (offset=252)
// vec3 Position (offset=256, 16*4N)
// float C_Atten(offset=268)
// float L_Atten(offset=272)
// float Q_Atten(offset=276)
PointLight Lights[MAX_SCENE_LIGHTS]; // Size: 48 * MAX_SCENE_LIGHTS
// -- Offset = 280
};
I have annotated the structs with the offsets I expect. The DirectionalLight and PointLight structures have their members ordered exactly as shown in the comments, excluding the paddings.
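For reference, here is roughly how the two light structs look on the C++ side; the member order is taken straight from the annotations above, and the exact types are my reconstruction (no explicit padding added):

struct DirectionalLight
{
    bool      Enabled;
    bool      Shadows;
    glm::vec3 Direction;
    float     Intensity;
    glm::vec3 Color;
};

struct PointLight
{
    bool      Enabled;
    bool      Shadows;
    glm::vec3 Color;
    float     Intensity;
    glm::vec3 Position;
    float     C_Atten;
    float     L_Atten;
    float     Q_Atten;
};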
In the case of matrices, I consider them column by column, each of which needs an alignment of 4N (16 bytes). Conveniently, the columns of the first three matrices end up tightly packed, because the offset always lands on a multiple of 4N:
View[0] -> Offset=0, Size=16
View[1] -> Offset=16, Size=16
View[2] -> Offset=32, Size=16
and so on...
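On the C++ side this should line up, since GLM stores a mat4 as four tightly packed vec4 columns; a quick sanity check (assuming the default GLM configuration, with no forced SIMD alignment):

#include <glm/glm.hpp>

static_assert(sizeof(glm::vec4) == 16, "vec4 = 4 floats");
static_assert(sizeof(glm::mat4) == 64, "mat4 = 4 tightly packed vec4 columns");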
By the time we are done with the NormalMatrix, our current offset is 176. Now it just so happens that the next two entries are bools, which align as expected. Afterwards, however, our offset is 176 + 4 + 4 = 184, which is NOT a multiple of 16 bytes, so the upcoming vec3 cannot start there. Does that mean we pad until we reach the next multiple of 16? That would require 8 additional bytes, bringing us to an offset of 192.
Following the vec3 is a float, which results in an offset of 192 + 12 + 4 = 208; that happens to be a multiple of 16, hence the subsequent vec3 field is packed tightly.
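To sanity-check this arithmetic I have been using a small helper (entirely my own, not part of any API) that rounds an offset up to a given alignment:

#include <cstddef>

// Round an offset up to the next multiple of the given alignment.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

static_assert(alignUp(184, 16) == 192, "vec3 Direction: 8 bytes of padding");
static_assert(alignUp(204, 4) == 204, "float Intensity: no padding needed");
static_assert(alignUp(208, 16) == 208, "vec3 Color: already 16-byte aligned");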
Now, I also read that the size of the structure will be padded so that it rounds up to the next multiple of 4N. Does that size include the 4-byte paddings I artificially added? Is it (4 + 4 + 12 + 4 + 12 = 36) or (4 + 4 + 4 + 4 + 12 + 4 + 12 = 44)? Do note that I did not include these paddings in the declaration of the struct; the annotations just show what I expect GLSL to interpret.
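Incidentally, for this particular struct I cannot tell the two interpretations apart from the final size, because both candidates round up to the same multiple of 16:

// Whether the raw size is 36 or 44, rounding up to a multiple of
// 4N (16 bytes) lands on 48 either way.
static_assert((36 + 15) / 16 * 16 == 48);
static_assert((44 + 15) / 16 * 16 == 48);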
The PointLights follow a similar procedure; I have done it for the first element of the array. Now what happens for the remaining elements? Am I just repeating the process all over again? That would mean we pick up at the offset where the last element left off. In this case, starting from an offset of 280, we get:
Lights[1].Enabled(offset=280)
Lights[1].Shadows(offset=284)
// -- Offset = 288, which is a multiple of 4N, no need to pad
Lights[1].Color(offset=288)
Lights[1].Intensity(offset=300)
// etc...
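To experiment with this, I wrote a tiny layout calculator (entirely my own helper; the per-member alignments and the no-stride-rounding assumption are exactly the parts I am unsure about):

#include <cstddef>
#include <cstdio>
#include <vector>

struct Member { const char* name; std::size_t align, size; };

std::size_t alignUp(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

int main()
{
    // PointLight members with the std140 alignments I assume.
    const std::vector<Member> pointLight = {
        {"Enabled",    4,  4}, {"Shadows",   4,  4},
        {"Color",     16, 12}, {"Intensity", 4,  4},
        {"Position",  16, 12}, {"C_Atten",   4,  4},
        {"L_Atten",    4,  4}, {"Q_Atten",   4,  4},
    };

    std::size_t offset = 224; // where Lights[0] starts in my annotation
    for (int i = 0; i < 2; ++i)            // first two array elements
        for (const Member& m : pointLight)
        {
            offset = alignUp(offset, m.align);
            std::printf("Lights[%d].%s -> offset %zu\n", i, m.name, offset);
            offset += m.size;
        }
}

Running this reproduces the offsets annotated above (Lights[1].Enabled at 280, Lights[1].Color at 288, and so on), but only because it bakes in my assumption that each element continues wherever the previous one ended.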
My end goal is to be able to tell OpenGL to set the uniform buffer by simply calling something like glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(GlobalSceneData), (const void*) &sceneData) to bulk-set the entire struct; but with the way alignments can change between array elements, matching the layout byte-for-byte appears very difficult. How do people usually deal with this?
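For completeness, the surrounding buffer setup I have in mind looks something like this (the binding point and usage hint are placeholders I picked):

GLuint ubo = 0;
glGenBuffers(1, &ubo);
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferData(GL_UNIFORM_BUFFER, sizeof(GlobalSceneData), nullptr, GL_DYNAMIC_DRAW);
glBindBufferBase(GL_UNIFORM_BUFFER, 0, ubo); // binding point 0, assumed to match the shader

// Each frame, after updating sceneData on the CPU:
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(GlobalSceneData), &sceneData);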