2

Yesterday I read some code of a colleague and came across this:

class a_class
{
public:
    a_class() {...}
    int some_method(int some_param) {...}

    int value_1;
    int value_2;
    float value_3;
    std::vector<some_other_class*> even_more_values;
    /* and so on */
 }

 a_class a_instances[10];

 void some_function()
 {
     do_stuff();
     do_more_stuff();

     memset(a_instances, 0, 10 * sizeof(a_class)); // <===== WTF?
 }

Is that legal (the WTF line, not the public attributes)? To me it smells really, really bad... The code ran fine when compiled with VC8, but it throws an "unexpected exception" when compiled with VC9 when calling a_instances[0].event_more_values.push_back(whatever), but when accessing any of the other members. Any insights?

EDIT: Changed the memset from memset(&a_instances... to memset(a_instances.... Thanks for pointing it out Eduard.
EDIT2: Removed the ctor's return type. Thanks litb.

Conclusion: Thanks folks, you confirmed my suspicion.

EricSchaefer
  • 25,272
  • 21
  • 67
  • 103

8 Answers8

9

This is a widely accepted method for initialization for C structs.
In C++ it doesn't work ofcourse because you can't assume anything about vectors internal structure. Zeroing it out is very likely to leave it in an illegal state which is why your program crashes.

shoosh
  • 76,898
  • 55
  • 205
  • 325
5

He uses memset on a non-POD class type. It's invalid, because C++ only allows it for the simplest cases: Where a class doesn't have a user declared constructor, destructor, no virtual functions and several more restrictions. An array of objects of it won't change that fact.

If he removes the vector he is fine with using memset on it though. One note though. Even if it isn't C++, it might still be valid for his compiler - because if the Standard says something has undefined behavior, implementations can do everything they want - including blessing such behavior and saying what happens. In his case, what happens is probably that you apply memset on it, and it would silently clear out any members of the vector. Possible pointers in it, that would point to the allocated memory, will now just contain zero, without it knowing that.

You can recommend him to clear it out using something like this:

...
for(size_t i=0; i < 10; i++)
    objects[i].clear();

And write clear using something like:

void clear() {
    a_object o;
    o.swap(*this);
}

Swapping would just swap the vector of o with the one of *this, and clear out the other variables. Swapping a vector is especially cheap. He of course needs to write a swap function then, that swaps the vector (even_more_values.swap(that.even_more_values)) and the other variables.

Johannes Schaub - litb
  • 496,577
  • 130
  • 894
  • 1,212
  • Never mind the ctor's return type. I wrote the above code from the top of my head. – EricSchaefer Jan 31 '09 at 16:23
  • As I wrote in a comment to Pierre, the class had already a correct clear() method, but he wasn't using it. After I told him my head aches with the memset he changed the code to a for-statement just like the you suggest. – EricSchaefer Jan 31 '09 at 16:29
4

I am not sure, but I think the memset would erase internal data of the vector.

4

When zeroing out a_instances, you also zero out the std_vector within. Which probably allocates a buffer when constructed. Now, when you try to push_back, it sees the pointer to the buffer being NULL (or some other internal member) so it throws an exception.

It's not legitimate if you ask. That's because you can't overload writing via pointers as you can overload assignment operators.

3

You shouldn't do memset on C++ objects, because it doesn't call the proper constructor or destructor.

Specifically in this case, the destructor of even_more_values member of all a_instances's elements is not called.

Actually, at least with the members that you listed (before /* and so on */), you don't need to call memset or create any special destructor or clear() function. All these members are deleted automatically by the default destructor.

Igor
  • 26,650
  • 27
  • 89
  • 114
3

You should implement a method 'clear' in your class

void clear()
  {
  value1=0;
  value2=0;
  value_3=0f;
  even_more_values.clear();
  }
Pierre
  • 34,472
  • 31
  • 113
  • 192
  • clear is not needed. even_more_values is a vector, and not a pointer – Igor Jan 31 '09 at 16:10
  • Even then, clear() won't dispose the objects that are contained in even_more_values. One of the big problems in the original question is one of ownership of allocated objects... – Bids Jan 31 '09 at 16:35
3

The worst part of it is that if the vector had anything in it, that memory is now lost because the constructor wasn't called.

NEVER over-write a C++ object. EVER. If it was a derived object (and I don't know the specifics of std::vector), this code also over-writes the object's vtable making it crashy as well as corrupted.

Whoever wrote this doesn't understand what objects are and needs you to explain what they are and how they work so that they don't make this kind of mistake in the future.

Michael Kohne
  • 11,888
  • 3
  • 47
  • 79
  • That should be never overwrite a non-POD object. See litb's answer for what a POD is. – KeithB Jan 31 '09 at 18:29
  • I doubt that losing the few words of pre-allocated buffer in the vectors are really much of an issue compared to the access violations that are going to be occurring. It's amazing that this has ever worked. – Eclipse Jan 31 '09 at 18:33
  • The thing is, if std::vector has no vtable (and thus no indirection), AND it happens to use NULL to denote that it's internal pointers aren't populated, then you might get away with stomping it to all zeros. You'd still lose memory, but you might well not crash. – Michael Kohne Jan 31 '09 at 20:06
2

What you have here might not crash, but it probably won't do what you want either! Zeroing out the vector won't call the destructor for each a_class instance. It will also overwrite the internal data for a_class.even_more_values (so if your push_back() is after the memset() you are likely to get an access violation).

I would do two things differently:

  1. Use std::vector for your storage both in a_class and in some_function().
  2. Write a destructor for a_class that cleans up properly

If you do this, the storage will be managed for you by the compiler automatically.

For instance:

class a_class
{
public:
    a_class() {...}
    ~a_class() { /* make sure that even_more_values gets cleaned up properly */ }

    int some_method(int some_param) {...}

    int value_1;
    int value_2;
    float value_3;
    std::vector<some_other_class*> even_more_values;
    /* and so on */
 }

 void some_function()
 {
     std::vector<a_class> a_instances( 10 );

     // Pass a_instances into these functions by reference rather than by using
     // a global. This is re-entrant and more likely to be thread-safe.
     do_stuff( a_instances );
     do_more_stuff( a_instances );

     // a_instances will be cleaned up automatically here. This also allows you some
     // weak exception safety.
 }

Remember that if even_more_values contains pointers to other objects, you will need to delete those objects in the destructor of a_class. If possible, even_more_values should contain the objects themselves rather than pointers to those objects (that way you may not have to write a destructor for a_class, the one the compiler provides for you may be sufficient).

Bids
  • 2,422
  • 18
  • 26