
It is very common in graphics programming to work with vertex formats. This is described, for example, here.

However, I am looking for a way to accomplish that which does not invoke undefined behavior (I'm mainly looking for C++ info, but C would be fine, too).

The common way to do it is like this: First, declare your vertex format as a struct.

struct Vertex {
    float x;
    float y;
    uint16_t someData;
    float etc;
};

Then, you create an array of these, fill them in, and send them to your graphics API (eg: OpenGL).

Vertex myVerts[100];
myVerts[0].x = 42.0;
// etc.

// when done, send the data along:
graphicsApi_CreateVertexBuffer(&myVerts[0], ...);

(Aside: I skipped the part where you tell the API what the format is; we'll just assume it knows).

However, the graphics API has no knowledge about your struct. It just wants a sequence of values in memory, something like:

|<---  first vertex   -->||<---  second vertex  -->| ...
[float][float][u16][float][float][float][u16][float] ...

And thanks to issues of packing and alignment, there is no guarantee that myVerts will be laid out that way in memory.
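To see the padding concretely, here is a small check you can compile yourself. The exact numbers are implementation-defined, which is the heart of the problem; the printed values below are what typical desktop ABIs produce, not a guarantee.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>
#include <cstdio>

struct Vertex {
    float x;
    float y;
    uint16_t someData;
    float etc;
};

// Two floats + a uint16_t + a float add up to 14 bytes of payload, but the
// compiler may insert padding (typically 2 bytes after someData), so on most
// desktop ABIs sizeof(Vertex) comes out as 16, not 14.
void printLayout() {
    std::printf("sizeof(Vertex)    = %zu\n", sizeof(Vertex));
    std::printf("offsetof someData = %zu\n", offsetof(Vertex, someData));
    std::printf("offsetof etc      = %zu\n", offsetof(Vertex, etc));
}
```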

Of course, tons of code is written this way, and it works, despite not being portable.

But is there any portable way to do this that is not

1. Inefficient, or
2. Awkward to write?

This is basically a serialization problem. See also: Correct, portable way to interpret buffer as a struct

The main standards-compliant way I know of is to allocate your memory as char[]. Then, you just fill in all the bytes exactly how you want them laid out.

But to transform from the struct Vertex representation above to that char[] representation would require an extra copy (and a slow byte-by-byte one, at that). So that's inefficient.

Alternatively, you could write data into the char[] representation directly, but that's extremely awkward. It's much nicer to say verts[5].x = 3.0f than addressing into a byte array, writing a float as 4 bytes, etc.
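To make that awkwardness concrete, here is a rough sketch of writing fields directly into the byte array. The helper names and the 14-byte packed layout are illustrative assumptions, not from any particular API; per-field `memcpy` is well-defined, but compare `setX(buf, 5, 3.0f)` with `verts[5].x = 3.0f`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical packed layout: [float x][float y][u16 someData][float etc] = 14 bytes.
constexpr std::size_t kPackedVertexSize = 14;

// Writing one field means computing a byte offset by hand and memcpy-ing the
// value in. This is legal and portable (modulo endianness), just clumsy.
void setX(unsigned char* buffer, std::size_t vertexIndex, float value) {
    std::memcpy(buffer + vertexIndex * kPackedVertexSize + 0, &value, sizeof(value));
}

void setSomeData(unsigned char* buffer, std::size_t vertexIndex, uint16_t value) {
    std::memcpy(buffer + vertexIndex * kPackedVertexSize + 8, &value, sizeof(value));
}
```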

Is there a good, portable way to do this?

jwd
  • No, of course not, since for example the binary representation of floating point numbers differs between platforms. Either restrict the platform portability or use marshalling/unmarshalling functions, which convert between the formats and operate at the character level. – Ctx Jan 22 '20 at 08:54
  • AFAIK, OpenGL API allows you to specify both the size of the vertex element and the offset and kind of each field in a structure. – Deedee Megadoodoo Jan 22 '20 at 08:56
  • You could have the data in your struct in a `char[]` of the right size and expose some methods to read from/write to the different members you would want it to have. This would sadly involve a lot of boilerplate. But if you want the struct to have a contiguous memory layout, to the best of my knowledge having an underlying `char[]` is the only way. – n314159 Jan 22 '20 at 08:58
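The `char[]`-backed struct the last comment describes could look roughly like this. The class name, accessor names, and the 14-byte packed layout are all assumptions for illustration; because the only member is a byte array, an array of these is a contiguous run of 14-byte records.

```cpp
#include <cstdint>
#include <cstring>

// Sketch of a vertex whose storage is a raw byte array in the packed
// [float][float][u16][float] layout, with typed accessors on top.
class PackedVertex {
    unsigned char bytes_[14];  // no padding by construction; alignment is 1

    template <typename T>
    T load(std::size_t offset) const {
        T v;
        std::memcpy(&v, bytes_ + offset, sizeof(T));  // well-defined type reinterpretation
        return v;
    }
    template <typename T>
    void store(std::size_t offset, T v) {
        std::memcpy(bytes_ + offset, &v, sizeof(T));
    }

public:
    float x() const { return load<float>(0); }
    void setX(float v) { store(0, v); }
    uint16_t someData() const { return load<uint16_t>(8); }
    void setSomeData(uint16_t v) { store(8, v); }
};

static_assert(sizeof(PackedVertex) == 14,
              "an array of PackedVertex is contiguous 14-byte records");
```

The boilerplate the commenter warns about is visible: every field needs a hand-written getter/setter pair with a hardcoded offset.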

1 Answer


However, the graphics API has no knowledge about your struct. It just wants a sequence of values in memory, something like:

|<---  first vertex   -->||<---  second vertex  -->| ...
[float][float][u16][float][float][float][u16][float] ...

This is not true. The graphics API has knowledge about your struct, because you told it about your struct. You even know this already:

(Aside: I skipped the part where you tell the API what the format is; we'll just assume it knows).

When you tell the graphics API where each field is in your struct, you should use sizeof and offsetof instead of guessing the layout of your struct. Then your code will work even if the compiler inserts padding. For example (in OpenGL):

struct Vertex {
    float position[2];
    uint16_t someData;
    float etc;
};

glVertexAttribPointer(position_index, 2, GL_FLOAT, GL_FALSE, sizeof(struct Vertex), (void*)offsetof(struct Vertex, position));
glVertexAttribIPointer(someData_index, 1, GL_UNSIGNED_SHORT, sizeof(struct Vertex), (void*)offsetof(struct Vertex, someData));
glVertexAttribPointer(etc_index, 1, GL_FLOAT, GL_FALSE, sizeof(struct Vertex), (void*)offsetof(struct Vertex, etc));

not

glVertexAttribPointer(position_index, 2, GL_FLOAT, GL_FALSE, 14, (void*)0);
glVertexAttribIPointer(someData_index, 1, GL_UNSIGNED_SHORT, 14, (void*)8);
glVertexAttribPointer(etc_index, 1, GL_FLOAT, GL_FALSE, 14, (void*)10);

Of course, if you were reading your vertices from disk as a blob of bytes in a known format (which could be different from the compiler's struct layout), then you may use a hardcoded layout to interpret those bytes. If you're treating vertices as an array of structs, then use the layout the compiler has decided for you.
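A hedged sketch of that disk-reading case: unpacking one 14-byte on-disk record (a hypothetical packed little-endian layout, separate from the compiler's struct layout) into the compiler's `Vertex`, field by field. The function name is illustrative; a `memcpy` per field is well-defined and compilers typically lower it to plain loads.

```cpp
#include <cstdint>
#include <cstring>

struct Vertex {
    float position[2];
    uint16_t someData;
    float etc;
};

// Convert one packed on-disk record into the in-memory struct. The offsets
// 0/4/8/10 describe the file format, not the compiler's struct layout.
Vertex unpackVertex(const unsigned char* record) {
    Vertex v;
    std::memcpy(&v.position[0], record + 0,  sizeof(float));
    std::memcpy(&v.position[1], record + 4,  sizeof(float));
    std::memcpy(&v.someData,    record + 8,  sizeof(uint16_t));
    std::memcpy(&v.etc,         record + 10, sizeof(float));
    return v;
}
```

Going the other way (serializing the struct to the file format) is the mirror image, with the `memcpy` arguments swapped.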

user253751
  • Thank you, this is helpful. However, are you saying that it *is* portable/defined behavior to have an array of `struct Vertex`, and, using the information gleaned from `sizeof` and `offsetof`, then read out specific values from that array, interpreting them in certain ways? In other words: is this still undefined, but we're kicking the can over to OpenGL, which has presumably provided a platform-appropriate solution? That is: If I controlled both ends of this serialization, could I write portable code (eg: send on one machine, receive on another)? – jwd Jan 22 '20 at 19:55
  • @jwd Is there a reason you expect this to be undefined? – user253751 Jan 23 '20 at 11:00
  • Not especially, but that's the tricky thing about UB — it happens by default, if it's not explicitly allowed by the standard. So, I'm just wondering if the approach we're discussing *is* explicitly supported. It could be Undefined Behavior, Unspecified Behavior, or Implementation-defined behavior (none of which are portable). Or, it could be that holy grail of "well-defined behavior", which is what I'm striving for (: – jwd Jan 23 '20 at 22:27