5

Possible Duplicate:
Is the “struct hack” technically undefined behavior?

I checked whether zero-length arrays are allowed in C++11. It appears they aren't. From 8.3.4 Arrays [dcl.array]:

If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero.

Since I can't use zero-length arrays, is it possible to use variable-length structs while staying standard and well defined? For example, I'd want to do something like the code below. How do I make it well defined and standard when the buffer may be empty?

-edit- related: Array of zero length

struct MyStruct {
    uint size;
    int32 buf[0];//<-- NonStandard!
};
...
auto len = GetLength();
auto ptr = GetPtr();
auto bytelen = len * sizeof(int32);
// Allocate room for the header plus the trailing buffer.
auto p = reinterpret_cast<MyStruct*>(malloc(sizeof(MyStruct) + bytelen));
p->size = len;
memcpy(p->buf, ptr, bytelen);
return p;

3 Answers

7

This is C++, not C. You don't need this flexible array member hack in C++, because you can easily make a template class which can endow any struct with a flexible array past the end and encapsulate the pointer arithmetic calculation and the memory allocation to make it work. Watch:

#include <cstring>

template <typename STRUCT, typename TYPE> class flex_struct {
public:
  TYPE *tail()
  {
    return (TYPE *) ((char *) this + padded_size());
  }

  // substitute malloc/free here for new[]/delete[] if you want
  void *operator new(size_t size, size_t tail)
  {
    size_t total = padded_size() + sizeof (TYPE) * tail;
    return new char[total];
  }

  void operator delete(void *mem)
  {
    delete [] (char *) mem;
  }
private:
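  static size_t padded_size() {
    // Size of the derived struct, rounded up to a multiple of alignof(TYPE).
    // sizeof the flex_struct base alone would not cover the struct's members.
    size_t padded = sizeof (STRUCT);
    if(padded % alignof(TYPE) != 0) {
         // Parentheses are required: + binds tighter than &.
         padded = (padded & ~(alignof(TYPE)-1)) + alignof(TYPE);
    }
    return padded;
  }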
};

struct mystruct : public flex_struct<mystruct, char> {
  int regular_member;
};

int main()
{
  mystruct *s = new (100) mystruct; // mystruct with 100 chars extra
  char *ptr = s->tail();            // get pointer to those 100 chars
  memset(ptr, 0, 100);              // fill them
  delete s;                         // blow off struct and 100 chars
}
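
The rounding done in padded_size() can be tried in isolation. The following is a minimal sketch, not the answer's code: `align_up`, `header`, and `tail_offset` are hypothetical names introduced here for illustration, and the formula assumes the alignment is a power of two (which `alignof` always yields).

```cpp
#include <cstddef>

// Hypothetical helper (not part of the answer's code): round `size` up to
// the next multiple of `align`, where `align` is a power of two.
constexpr std::size_t align_up(std::size_t size, std::size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

// Example header type, assumed for illustration only.
struct header {
    char tag;
};

// Offset at which a trailing array of T could start after a header of type H.
template <typename H, typename T>
constexpr std::size_t tail_offset()
{
    return align_up(sizeof(H), alignof(T));
}

static_assert(align_up(1, 4) == 4, "1 rounds up to 4");
static_assert(align_up(4, 4) == 4, "aligned values are unchanged");
static_assert(align_up(6, 8) == 8, "6 rounds up to 8");
static_assert(tail_offset<header, char>() == sizeof(header),
              "a char tail needs no padding");
static_assert(tail_offset<header, double>() % alignof(double) == 0,
              "a double tail starts at an aligned offset");
```

Writing the rounding as one fully parenthesized expression avoids having to think about the relative precedence of `+` and `&`.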
R. Martinho Fernandes
Kaz
  • I love this answer. But for some reason I don't believe this will work. I'm going to write code to figure the problems out. +1 –  Mar 05 '12 at 21:32
  • It may not work in the sense that it doesn't meet someone's specific requirements, but I compiled the code with g++ -Wall -ansi, and ran the ./a.out under valgrind: no errors. – Kaz Mar 05 '12 at 21:51
  • I can't figure out any other way to write the code you've written. The biggest problem is alignment. Let's say your members are 4 or 6 bytes and the array type is long, which requires 8-byte alignment. It would be wrong. Also, the new is incorrect, as you have 100 extra bytes and don't do sizeof on the regular members, BUT I know how to fix that and I get your actual point. –  Mar 05 '12 at 21:52
  • Valgrind didn't report an error!?! But you did memset on ptr for 100 bytes, and ptr should be +4 bytes after s, which means you did a 4-byte overrun :/. How can that be correct!?! –  Mar 05 '12 at 21:53
  • This is neither standard-compliant nor portable. Consider `struct s : public flex_struct<s, int> { char c; }`. If the struct is allocated at the aligned address `X`, then the `int` array starts at the unaligned address `X+1`. Bad things can happen with unaligned pointers. – Robᵩ Mar 05 '12 at 21:55
  • The template class calculates the size of the struct. That's what the expression sizeof(flex_struct) is for. So 104 bytes are allocated in the operator new: 4 bytes for the struct, 100 for the requested extra array (100 * sizeof (TYPE)) where TYPE was specified as char in the template arguments. You're right about alignment; the calculation needs to be a little more complicated to handle that 100% right for all combinations of types. – Kaz Mar 05 '12 at 22:00
  • Bad things either happen with unaligned pointers (CPU exception) or they don't. It's only nonportable when you actually have a combination that leads to alignment requirements not being met. For instance if the first member is int and the array element type is short, there is no problem. The next byte after the struct is aligned for type int, and the alignment requirement of short is less strict. I think this issue is easy enough to fix; not willing to hack on this any more. – Kaz Mar 05 '12 at 22:01
  • You'll need C++11 alignment support, but it can now be fixed portably. – MSalters Mar 06 '12 at 09:26
  • `tail` can be made substantially shorter (while losing the über-ugly C-style cast): `return reinterpret_cast<TYPE *>(this + 1);` – Konrad Rudolph Mar 06 '12 at 13:34
  • @KonradRudolph you may want to see http://stackoverflow.com/questions/9584641/is-this-code-legal-portable-accesing-data-outside-of-struct –  Mar 06 '12 at 13:46
  • @Kaz Here's what I worked out based on your example: http://stackoverflow.com/questions/9584641/is-this-code-legal-portable-accesing-data-outside-of-struct –  Mar 06 '12 at 13:47
  • I gave it a try at fixing the alignment issue. I suppose it could be done in C++03 too, with `boost::alignment_of`. – R. Martinho Fernandes Mar 06 '12 at 14:17
  • What is über-ugly, reinterpret_cast<foo *>(expr) or (foo *) expr? It's in the eye of the beholder. The C-style cast is shorter and more convenient, and equivalent to the appropriate C++-style cast that fits the situation. The downside is that the machine is choosing which C++ cast that is, and you are not coding a constraint. If the code changes so that the conversion is inappropriate, you don't get a warning. Usually this is only an academic/theoretical problem that doesn't justify the inconvenience of blabber_blab(sproink). – Kaz Mar 07 '12 at 21:38
6

No, you cannot do it compliantly *.

Use a std::vector.

* I'm assuming that C++ doesn't add any rules that contradict C in this area. IMO it's highly unlikely, though I don't have time to verify that at the minute.
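
As a minimal sketch of the std::vector route, assuming the goal is a struct carrying a buffer of 32-bit integers that may be empty: `MyVectorStruct` and `make_struct` are hypothetical names introduced here, and `std::int32_t` stands in for the question's `int32`. The vector keeps its own length, so the separate size field disappears, at the cost of one extra indirection.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical replacement for MyStruct: the vector stores its own length,
// so no separate size field or trailing array is needed.
struct MyVectorStruct {
    std::vector<std::int32_t> buf;
};

// Copies `len` values from `src`. `len` may be zero, in which case `src`
// is never read and the buffer stays empty.
inline MyVectorStruct make_struct(const std::int32_t* src, std::size_t len)
{
    MyVectorStruct s;
    if (len != 0) {
        s.buf.assign(src, src + len);
    }
    return s;
}
```

With this layout `s.buf.size()` plays the role of `size`, and `s.buf.data()` gives a contiguous pointer for APIs that need one.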

Lightness Races in Orbit
  • OK, I undid my downvote, BUT the question is about how to define the struct. I have no questions about how to allocate the memory. For example, the pointer may point to a buffer used in fread, and what if the last 4 bytes are the `size`? Then I believe `buf[1]` is wrong/illegal? –  Mar 05 '12 at 21:01
  • While “use a `std::vector`” is the best advice in the general case, the variable-length `struct` hack is for when you want to eliminate indirections as much as possible, so `vector` may not be the best option. – Jon Purdy Mar 05 '12 at 21:08
  • @JonPurdy: It's the only option that satisfies the question's stated requirements of standard compliance and defined behaviour, short of statically-allocating a vast array that you only use part of. – Lightness Races in Orbit Mar 05 '12 at 21:09
  • But the answer has nothing to do with my question. You didn't say whether I may define buf as 0 length or even 1 length, nor did you mention alignment and such. Which is why I originally downvoted it. –  Mar 05 '12 at 21:12
  • @acidzombie24: It has everything to do with your question. You asked whether you can do either compliantly, to which the answer is "no". – Lightness Races in Orbit Mar 05 '12 at 21:13
  • Good point. So I can't use that struct at all, hmm... OK, I'll +1 it. –  Mar 05 '12 at 21:21
  • You may want to see http://stackoverflow.com/questions/9584641/is-this-code-legal-portable-accesing-data-outside-of-struct –  Mar 06 '12 at 13:45
1

The struct hack was never standard. This should be a standard viable replacement:

struct MyStruct {
    uint size;
    int32 buf[1];
};
BЈовић
  • It'll still be UB to access beyond the end of `buf`, allocated memory or not, no? – Lightness Races in Orbit Mar 05 '12 at 20:53
  • @LightnessRacesinOrbit No, I think it is not UB. I hope I am not wrong ;) – BЈовић Mar 05 '12 at 20:55
  • What happens when size is 0 and p->buf[0] is out of bounds/not allocated? is it still legal? –  Mar 05 '12 at 20:55
  • I think the question here is whether or not it's UB to access allocated memory through an object that semantically does not extend into that memory. I say yes, but I don't have time to check/research why at the minute. – Lightness Races in Orbit Mar 05 '12 at 20:57
  • I just found [this](http://stackoverflow.com/questions/3711233/is-the-struct-hack-technically-undefined-behavior/). Apparently it is UB in C, therefore most likely in C++ as well. – BЈовић Mar 05 '12 at 20:57
  • The question is really what happens if there is NO MEMORY allocated for buf (for example, the pointer may point to a buffer used in fread, and what if the last 4 bytes are the size? Then I believe buf[1] is wrong/illegal?). Is the struct declaration UB since buf isn't 1 byte (although you say it is)? –  Mar 05 '12 at 21:03
  • @VJovic: That makes me wonder what pointer math at all produces clearly defined behavior, since array indexing is just pretty pointer math. if `p->buf[99]` is UB, is `*(p->buf + 99)` UB? – Drew Dormann Mar 05 '12 at 21:04
  • @Drew yes. The first one is defined in terms of the second. – R. Martinho Fernandes Mar 05 '12 at 21:09
  • @DrewDormann: Correct; advancing a pointer past the object it points to is UB. I wonder whether, strictly speaking, that makes `char* x = new char[50]; x++;` UB though. Since you have `char*` not `char(*)[50]`. I'd guess there's an allowance for `new[]` – Lightness Races in Orbit Mar 05 '12 at 21:11
  • In the case of `char* x = new char[50]` `x` is pointing to an array of size 50, just as if you said `char a[50]; char *x = a;` so you can increment the pointer within that range and not be advancing the pointer past the object it points to. With the struct hack an array with a defined size is decaying to a pointer, and then that pointer is advanced outside the legal range. – bames53 Mar 05 '12 at 21:20
  • @LightnessRacesinOrbit `x` is a pointer to the first element of an array. You can move that pointer around as long as you keep it inside that array. – R. Martinho Fernandes Mar 05 '12 at 21:22
  • @R.MartinhoFernandes: But according to the type, it's a pointer to a `char`. One `char`. That you can in practice increment it to get another `char` may not be relevant. Otherwise I might as well get a pointer to my washing machine and call it well-defined. – Lightness Races in Orbit Mar 05 '12 at 21:35
  • @LightnessRacesinOrbit: I'm not certain that I follow you, but you seem to be saying that for any `char* x`, incrementing x is **always** UB because its type is explicitly "pointer to one char". Is that what you're saying? – Drew Dormann Mar 05 '12 at 21:54
  • @DrewDormann: I'm _wondering_ whether that's the case. – Lightness Races in Orbit Mar 05 '12 at 22:02
  • @bames53: `char*` and `char(*)[50]` are two distinct types. – Lightness Races in Orbit Mar 05 '12 at 22:03
  • @LightnessRacesinOrbit I don't think my comment is confusing those types. The C++ standard says that adding to/subtracting from a pointer is defined "If the pointer operand points to an element of an array object, and the array is large enough." `new char[50]` returns a pointer to the first element of an array object. Therefore adding to and subtracting from that pointer value is defined within that range. Just like in the case `char a[50]; char *p = a;` `a` decays to a pointer to the first element and then it's defined behavior to add to and subtract from the pointer within that range. – bames53 Mar 05 '12 at 22:27
  • @bames53: No, the memory allocated by `new char[50]` is not an array object. It is a block of 50 bytes containing 50 `char`s. – Lightness Races in Orbit Mar 05 '12 at 22:31
  • @LightnessRacesinOrbit It is an array object. See § 5.3.4 [expr.new] p5 – bames53 Mar 05 '12 at 23:24
  • @bames53: Ah, excellent! "I'd guess there's an allowance for new[]" would seem to be answerable "yes, there is" then. – Lightness Races in Orbit Mar 05 '12 at 23:50
  • @bames53 Maybe you can help. What part of § 5.3.4 says that is legal? Here is what I implemented. Is this legal? http://stackoverflow.com/questions/9584641/is-this-code-legal-portable-accesing-data-outside-of-struct –  Mar 06 '12 at 13:48
  • This is definitely UB and gcc 4.8 is now capable of noticing and optimizing away code relying on it. – strcat May 13 '13 at 20:16
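
The distinction drawn in the comments above can be sketched as follows: arithmetic on a pointer into the array object created by `new char[n]` is defined anywhere from the first element up to one past the last, and the one-past-the-end pointer may be formed and compared but not dereferenced. `sum_new_array` is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>

// Fills an array obtained from new[] with ones and sums it again using
// pointer arithmetic. Every pointer formed here stays inside the array
// object or exactly one past its end.
inline int sum_new_array(std::size_t n)
{
    char* x = new char[n];          // x points to the first element of an array object
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = 1;                   // x[i] is defined as *(x + i)
    }
    int total = 0;
    char* end = x + n;              // one past the end: formed, never dereferenced
    for (char* p = x; p != end; ++p) {
        total += *p;
    }
    delete[] x;
    return total;
}
```

The struct hack differs in that the array declared in the struct has a fixed bound of one element, so indexing beyond it leaves the array object the pointer was derived from.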