23

I'm integrating some code into my library. It is a complex data structure well optimized for speed, so i'm trying not to modify it too much. The integration process goes well and actually is almost finished (it compiles). One thing is still bothering me. I'm getting the C4200 warning multiple times:

warning C4200: nonstandard extension used : zero-sized array in struct/union
Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array

The code works but this warning gives me creeps (especially the part with copy-ctor). THe warning appears because of structures declared like this:

#pragma pack( push )
#pragma pack( 1 )
// String
struct MY_TREEDATSTR
{
    BYTE btLen;
    DWORD dwModOff;
    BYTE btPat[0];
};

typedef MY_TREEDATSTR TREEDATSTR;
typedef MY_TREEDATSTR *PTREEDATSTR;

#pragma pack( pop )

Note the btPat[0]. Is there a way how to easily and correctly get rid of this warning without breaking the code and/or having to change too much in it. Notice the #pragma's, have their any significance according to this warning? And why is the structure declared this way anyway? (I mean the btPat thing, not the #pragma's, those i understand).

Note: i saw this similar question, but it really didn't help me.

Update: as I said, the code works and gives correct results. So a copy-constructor or assignment operator is apparently really not needed. And as i look at the code, none of the structures get memcpy-ed.

Community
  • 1
  • 1
PeterK
  • 6,287
  • 5
  • 50
  • 86
  • 1
    You use a `std::vector` if you want a dynamic array. – GManNickG Jul 28 '10 at 07:47
  • 3
    That's a strange packing to have; you will have a misaligned `DWORD`. – dreamlax Jul 28 '10 at 07:53
  • The "similar question" is actually about a standard construct, whereas yours is an nonstandard extension. – MSalters Jul 28 '10 at 07:54
  • 4
    You have another unrelated problem - creating names like _TREEDATSTR that begin with an underscore and an uppercase letter is illegal in user code. –  Jul 28 '10 at 08:00
  • @PeterK: The `#pragma` directives remove any padding that the compiler would otherwise insert (usually for optimal access or required alignment for the target architecture). – dreamlax Jul 28 '10 at 08:03
  • @Neil: i know, the underscore is there to obfuscate the real code, since there was a prefix which would easily identify the company (product) I work for. I just deleted it and the underscore remained there. Edited to make you happy ;) – PeterK Jul 28 '10 at 10:00

6 Answers6

15

If this is a MSVC compiler (which is what the warning message tells me), then you can disable this warning using #pragma warning, ie.:

#pragma warning( push )
#pragma warning( disable : 4200 )
struct _TREEDATSTR
{
    BYTE btLen;
    DWORD dwModOff;
    BYTE btPat[0];
};
#pragma warning( pop )

BTW, the message about the copy-constructor is not creepy, but a good thing because it means, that you can't copy instances of _TREEDATSTR without the unknown bytes in btPat: The compiler has no idea how big _TREEDATSTR really is (because of the 0-size array) and therefore refuses to generate a copy constructor. This means, that you can't do this:

_TREEDATSTR x=y;

which shouldn't work anyway.

Nordic Mainframe
  • 28,058
  • 10
  • 66
  • 83
  • 5
    Chances are that a copy constructor or copy-assignment operator is not needed anyway, a "C-style dynamically-sized struct" like this one is usually copied with `memcpy` etc. – dreamlax Jul 28 '10 at 08:02
15

I'll assume that you do want this to be compiled in pure C++ mode, and that you don't want just to compile some files in C and some in C++ and later link.

The warning is telling you that the compiler generated copy constructor and assignment will most probably be wrong with your structure. Using zero-sized arrays at the end of a struct is usually a way, in C, of having an array that is decided at runtime, but is illegal in C++, but you can get similar behavior with a size of 1:

struct runtime_array {
   int size;
   char data[1];
};
runtime_array* create( int size ) {
   runtime_array *a = malloc( sizeof(runtime_array) + size ); // [*]
   a->size = size;
   return a;
}
int main() {
   runtime_array *a = create( 10 );
   for ( int i = 0; i < a->size; ++i ) {
      a->data[i] = 0;
   }
   free(a);
}

This type of structures are meant to be allocated dynamically --or with dynamic stack allocation trickery--, and are not usually copied, but if you tried you would get weird results:

int main() {
   runtime_array *a = create(10);
   runtime_array b = *a;          // ouch!!
   free(a);
}

In this example the compiler generated copy constructor would allocate exactly sizeof(runtime_array) bytes in the stack and then copy the first part of the array into b. The problem is that b has a size field saying 10 but has no memory for any element at all.

If you still want to be able to compile this in C, then you must resolve the warning by closing your eyes: silent that specific warning. If you only need C++ compatibility, you can manually disable copy construction and assignment:

struct runtime_array {
   int size;
   char data[1];
private:
   runtime_array( runtime_array const & );            // undefined
   runtime_array& operator=( runtime_array const & ); // undefined
};

By declaring the copy constructor and assignment operator the compiler will not generate one for you (and won´t complain about it not knowing how). By having the two private you will get compile time errors if by mistake you try to use it in code. Since they are never called, they can be left undefined --this is also used to avoid calling it from within a different method of the class, but I assume that there are no other methods.

Since you are refactoring to C++, I would also make the default constructor private and provide a static public inlined method that will take care of the proper allocation of the contents. If you also make the destructor private you can make sure that user code does not try to call delete on your objects:

struct runtime_array {
   int size;
   char data[1];
   static runtime_array* create( int size ) {
      runtime_array* tmp = (runtime_array*)malloc(sizeof(runtime_array)+size);
      tmp->size = size;
      return tmp;
   }
   static void release( runtime_array * a ) {
      free(a);
   }
private:
   runtime_array() {}
   ~runtime_array() {}
   runtime_array( runtime_array const & );            // undefined
   runtime_array& operator=( runtime_array const & ); // undefined
};

This will ensure that user code does not by mistake create your objects in the stack nor will it mix calls to malloc/free with calls to new/delete, since you manage creation and destruction of your objects. None of this changes affects the memory layout of your objects.

[*] The calculation for the size here is a bit off, and will overallocate, probably by as much as sizeof(int) as the size of the object has padding at the end.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • Attention: If you define copy-assignment, then runtime_array does not qualify as a POD anymore. – Nordic Mainframe Jul 28 '10 at 08:41
  • @Luther Blissett: right, but was it a POD in the first time? I am not talking about the compiler but the semantics. For the compiler it looks like a POD but many of the operations that can be performed on a POD cannot be performed in that object, like bitwise copying in copy assignment. – David Rodríguez - dribeas Jul 28 '10 at 08:47
  • The OP's struct looks like a POD. OTOH, referring to a non-static field of an non-POD before the constructor began execution is UB (12.7), `a=malloc(); a->size=xxx` is just that. – Nordic Mainframe Jul 28 '10 at 09:03
  • @Luther Blisset: Technically I agree, but in this particular case the default constructor will perform no operation at all on the memory, so the object will be exactly the same before and after calling the constructor. You could add a noop call to in-place new there: `new (a) runtime_array;`, but that is guaranteed by the standard not to do anything --an empty constructor for the class will leave the only field that the compiler could possibly touch uninitialized. I know that this is walking on the edge, but I believe that it can be assumed to be safe with the current class definition. – David Rodríguez - dribeas Jul 28 '10 at 09:13
  • Note that the guarantees that the standard makes for POD types do not hold with the general use case of empty arrays. In particular the standard *guarantees* that for any POD type: `memcpy( &a, &b, sizeof(T))` will end up with `a` and `b` being exactly the same, and that is not true in this particular case (and with the common usage pattern from C): see the example with the `ouch!!` comment above. The struct itself was a POD, but the usage was not that of a POD --all the memory acquired beyond the end of the struct and accessed through the empty array is non-POD of sorts. – David Rodríguez - dribeas Jul 28 '10 at 09:22
  • 1
    I'm totally with you in this thing. C++ is sometimes overly strict in incomprehensible ways. Anyway, if I'm getting this right then C++0x fixes this and allows PODs with *trivial* ctors/dtor/assigment which would make your code defined. – Nordic Mainframe Jul 28 '10 at 10:37
  • 1
    I could not get the "zero sized array" warning to go away in MSVC 2012, without disabling warning 4200 explicitly. Declaring explicit private copy-constructors and copy-assignment did not help. – Mark Lakata Jun 09 '14 at 20:07
  • 1
    @MarkLakata: Uhm... yes, rereading the answer I noticed that I forgot to mention that while the size of the array being 0 is legal in C, it is not in C++ 8.3.4/1 *If the constant-expression (5.19) is present,it shall be an integral constant expression and its value shall be greater than zero.* You will need to use `1` instead of `0`, some of your objects may pay a bit more for the memory, but hopefully not too much (probably a word) – David Rodríguez - dribeas Jun 12 '14 at 12:32
  • I use this technique along with placement new of the array elements. If you make the array initially of size 1, you end up in different kinds of trouble, given that then the 1 element is being default initialized by the objects constructor. Depending on the type of array elements, you could get a memory leak. For sure, you might get double destructor call on that first array element. – BitTickler Mar 21 '16 at 06:31
3

Try changing it to say btPat[1] instead. I think both C++ and C standards dictate that an array cannot have 0 elements. It could cause problems for any code that rely on the size of the _TREEDATSTR struct itself, but usually these sorts of structs are typecast from buffers where (in this case) the first byte of the buffer determines how many bytes are actually in btPat. This kind of approach relies on the fact that there is no bounds checking on C arrays.

dreamlax
  • 93,976
  • 29
  • 161
  • 209
  • 1
    To be precise, according to the standard, only array not allocated on the heap cannot have zero elements: int NotStandard[0]; int * Standard = new int[0]; – Francesco Jul 28 '10 at 08:28
  • 1
    "both C++ and C standards dictate that an array cannot have 0 elements" - C99 can. See https://en.wikipedia.org/wiki/Flexible_array_member It's just C++ which is unnecessarily strict here, though gcc/clang/msvc do actually support it. – Dwayne Robinson Sep 28 '17 at 00:43
1

If it's complaining about the copy constructor and assignment operator functions, couldn't you supply your own. If you don't want them, declare them private.

This may produce a lot of errors elsewhere in the code if you are assigning or copying without realising it, in which case it wouldn't have worked anyway because there are no automatically generated ones.

Brian Hooper
  • 21,544
  • 24
  • 88
  • 139
1

The main idea for this in C is to get for _TREEDATSTR elements the needed extra memory; in other words allocation will be done with malloc(sizeof(_TREEDATSTR) + len).

Pragma pack is used to ask the compiler to leave no empty spaces between the fields (normally compilers do sometimes leave some unused bytes between fields of structres to guarantee alignment because in many modern processors this is a huge speed improvement).

Note however that there are architectures where unaligned access is not just slow... but totally forbidden (segfault) so those compilers are free to ignore the pragma pack; code that uses pragma pack is inherently unportable.

I think I would have put the dword first in the structure, and this probably wouldn't have required a pragma pack; also a way to silence the warning is to allocate a one element array and doing the allocation using (len-1) extra bytes.

In C++ all this stuff is quite dangerous, because you're basically fooling the compiler into thinking that the size of the object is smaller than it really is, and given that C++ is a copy-logic language this means asking for troubles (for example for copy construction and assignment functions that will act only on the first part of the object). For everyday use it's surely MUCH better to use for example an std::vector instead, but this of course will come at an higher price (double indirection, more memory for every _TREEDATSTR instance).

I normally don't like thinking all other programmers are idiots, so if this kind of bad trickery has been used then probably there is a well paying reason for it... For a definitive judgment however a much deeper inspection would be needed.

To summarize:

  1. Using a zero element array at the end of an array is a trick used to create variable-sized objects. The allocation is done by requesting sizeof(structure) + n*sizeof(array_element) bytes.
  2. Pragma pack is used to tell the compiler to avoid adding extra padding bytes between structure fields. This is needed when a precise control on the memory layout is needed (for example because those object are being accessed by hand-written assembly)
  3. Don't do that in C++ unless you really need it and you know what you're doing

There is no way to "correctly" silence the warning because the code wants to play dirty (and C++ compilers don't like to be fooled about object size). If you use this object inside other objects, or as a base for other objects, or pass it aroud by value then whatever bad happens you asked for it.

GabeTheApe
  • 348
  • 2
  • 13
6502
  • 112,025
  • 15
  • 165
  • 265
  • Another reason why people do this is locality. Caches and cache misses have big impact on performance these days. Extra indirections, coming with newly allocated heap areas (std::vector) have negative impact on locality. – BitTickler Mar 21 '16 at 06:40
0

Although I realise that this is an old thread, I would like to give my pure c++11 solution to the OP's question. The idea is to wrap the to be allocated object, adding in the padding to align the objects in array to power of 2 adresses, in the following way:

template<typename T, std::size_t ObjectPaddingSize>
struct PaddedType : private T { private: char padding [ ObjectPaddingSize ]; };

template<typename T> // No padding.
struct PaddedType<T, 0> : private T { };

template<typename T>
struct PaddedT : private PaddedType<T, NextPowerOfTwo<sizeof ( T )>::value - sizeof ( T )> { };

The objects padding size can be calculated at compile-time with the following class (returns L if L is power of 2, else the next power of 2 gt L):

template<std::size_t L>
class NextPowerOfTwo {

    template <std::size_t M, std::size_t N>
    struct NextPowerOfTwo1 {

        enum { value = NextPowerOfTwo1<N, N & ( N - 1 )>::value };
    };

    template <std::size_t M>
    struct NextPowerOfTwo1<M, 0> {

        enum { value = M << 1 };
    };

    // Determine whether S is a power of 2, if not dispatch.

    template <std::size_t M, std::size_t N>
    struct NextPowerOfTwo2 {

        enum { value = NextPowerOfTwo1<M, M>::value };
    };

    template <std::size_t M>
    struct NextPowerOfTwo2<M, 0> {

        enum { value = M };
    };

public:

    enum { value = NextPowerOfTwo2<L, L & ( L - 1 )>::value };
};
degski
  • 642
  • 5
  • 13