I have a C++ class containing a bunch of data members of the same type and I want to iterate over them:

// C.h

class C {

  // other members

  double foo;
  double bar;
  ...
  double barf; // 57th double declared since foo, nothing else in between

  // other members
};

Pointer arithmetic seems to work, e.g. here using the constructor to initialize those 58 member doubles:

// C.cpp

C::C() {
  for (int i = 0; i < 58; i++) {
    *(&this->foo + i) = 0;
  }
}

I found these related questions: "How to iterate through variable members of a class C++", "C++: Iterating through all of an object's members?", "Are class members garaunteed to be contiguous in memory?" and "Class contiguous data". Some people there suggest this kind of thing is OK and others say it's a no-no. The latter say there's no guarantee it won't fail, but don't cite any instances of it actually failing. So my question is: does anyone else use this, or has anyone tried it and got into trouble?

Or maybe there's a better way? Originally in my application I did actually use an array instead to represent my object, with indices like so:

int i_foo = 0, i_bar = 1, ..., i_barf = 57;

However once I introduced different objects (and arrays thereof) the index naming started to get out of hand. Plus I wanted to learn about classes and I'm hoping some of the other functionality will prove useful down the line ;-)

I use the iteration pretty heavily, e.g. to calculate statistics for collections of objects. Of course I could create a function to map the class members to an array one-by-one, but performance is a priority. I'm developing this application for myself to use on Windows with VS. I would like to keep other platform options open, but it's not something I intend to distribute widely. Thanks

George Skelton
  • Please don't. Just use an array. Or any other real data structure really. – Bartek Banachewicz Mar 06 '14 at 15:57
  • Though this is possible, I wonder why? I think you can have a better solution (like a method that will return the i-th attribute: double get(size_t idx) { switch (idx) { case 0: return foo; } }). – ebasconp Mar 06 '14 at 15:58
  • And then you add a virtual destructor or use a different memory layout and the iteration fails. So as mentioned before please use an array or something which offers iteration without hacks like this. – KimKulling Mar 06 '14 at 16:01
  • Rather than `int i_foo = ...`, you can use `enum C_Index { Foo, Bar, ...`, benefitting from the implicitly incrementing enumeration values and allowing a type-safe `operator[](C_Index i)` so you don't accidentally pass enums related to some other class. – Tony Delroy Mar 06 '14 at 16:01
  • It's only somewhat "possible". As soon as you add a virtual function then the compiler may just place a pointer to the vtable right where your first element is. Also, I'm pretty sure it can reorder members based on their visibility. – bstamour Mar 06 '14 at 16:02
  • You haven't really explained why you need to treat the data members of the class as an array. You haven't explained what you mean by "introduced different objects (and arrays thereof)"; are these objects in the first object or not? Right now, this looks like a case of premature optimization. – Mike DeSimone Mar 06 '14 at 16:13
  • Those of you concerned about this breaking if a virtual table is introduced: it won't so long as everything is `double`, because he uses `*(&this->foo + i)` (which I would have written `(&this->foo)[i]` for clarity) and not `*((double*)&this + i)`. – Mike DeSimone Mar 06 '14 at 16:16
  • Since you want to iterate over variables you need to choose a *container*, such as `std::vector`, `std::map` or `std::list`. These data structures provide support for *iterators*. – Thomas Matthews Mar 06 '14 at 16:25
  • Thanks for all the responses. @MikeDeSimone These are only doubles I'm iterating over, not other objects. By arrays I mean I have an array: C* ac = new C[n]; and I access the members as ac[j].foo. By different objects I mean I have another class D also with a sequence of member doubles, some of which have the same names and some of which don't. So if I were using arrays instead of classes I would then need to have indexing like i_Cfoo = 0, i_Cbar = 1, ... and i_Dbar = 0, i_Dbaz = 1, ... – George Skelton Mar 06 '14 at 17:39
  • And I need to iterate over it like an array to calculate e.g. the standard deviation of foo over ac and the std dev of bar etc. – George Skelton Mar 06 '14 at 17:44
  • @GeorgeSkelton: All of those points go towards why you should use a class; I'm still not sure why you'd want to treat the members of an object of that class as an array. I don't see how `C* ac = new C[n]` matters here; having an array of objects is orthogonal to the issue of having an array of doubles in a class, or why you have 57 doubles declared in your class. – Mike DeSimone Mar 06 '14 at 21:14
  • @oopscene Sorry for the delay - just back from vacation. I tried your idea and created hard coded set & get functions. The performance was identical with optimization (although ~3x slower without). Since this enables me to keep the functionality of classes whilst guaranteeing the serial data access, I will accept yours as the answer if you post it as such. Thanks – George Skelton Mar 17 '14 at 11:53
  • @GeorgeSkelton: Cool, thanks! :) I already posted it. – ebasconp Mar 17 '14 at 17:52
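
The enum-indexed approach suggested in the comments could look roughly like the sketch below. It is only an illustration under assumptions: the class is cut down to three hypothetical fields (`Foo`, `Bar`, `Barf`) instead of 58, and the helper `sum_fields` is invented here to show iteration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's class C, cut down to three fields.
class C {
public:
    // Unscoped enum: values increment implicitly, so Count equals the number of fields.
    enum Index : std::size_t { Foo, Bar, Barf, Count };

    // Named access by enumerator, e.g. c[C::Bar]. A plain int does not convert
    // to Index, so indices belonging to some other class are rejected at compile time.
    double& operator[](Index i) {
        assert(i < Count);
        return values_[i];
    }
    double operator[](Index i) const {
        assert(i < Count);
        return values_[i];
    }

    // Iteration over all fields through a real array, no pointer tricks.
    double* begin() { return values_.data(); }
    double* end() { return values_.data() + values_.size(); }

private:
    std::array<double, Count> values_{};  // value-initialized: every field starts at 0.0
};

// Hypothetical helper: sum of all fields of one object, using range-for.
double sum_fields(C& c) {
    double total = 0.0;
    for (double v : c) total += v;
    return total;
}
```

Because the storage is a genuine `std::array`, the elements are contiguous by definition, and the constructor loop from the question is replaced by value-initialization.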

2 Answers

George:

I think you can have a better solution, such as a method that returns the i-th attribute:

double get(size_t idx)
{
  switch (idx)
  {
    case 0: return foo; 
    case 1: return bar;
    case 2: return foo_bar;
    ....
  }
}
ebasconp
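
One way the switch-based accessor above might be fleshed out is sketched here. This is an assumption-laden illustration, not the code the asker ended up with: the class has only three hypothetical fields, the accessor is named `at` and returns a reference so it can serve as both getter and setter, and `field_mean` is an invented helper showing a per-field statistic over a collection of objects.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical cut-down version of the question's class (3 fields instead of 58).
class C {
public:
    double foo = 0.0;
    double bar = 0.0;
    double barf = 0.0;

    // Returns a reference to the idx-th field, so it works for reading and writing.
    // Each case names a real member, so no assumption about memory layout is made.
    double& at(std::size_t idx) {
        switch (idx) {
            case 0: return foo;
            case 1: return bar;
            case 2: return barf;
            default: throw std::out_of_range("C::at: bad field index");
        }
    }
};

// Hypothetical helper: mean of field idx across a non-empty collection of objects.
double field_mean(std::vector<C>& objects, std::size_t idx) {
    assert(!objects.empty());
    double total = 0.0;
    for (C& c : objects) total += c.at(idx);
    return total / static_cast<double>(objects.size());
}
```

The explicit `default` case means an out-of-range index raises an exception instead of falling off the end of the function.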

Using pointer arithmetic to iterate over class data members can cause problems during code optimization. Example:

struct Vec3
{
    double x, y, z;

    inline Vec3& operator =(const Vec3& that)
    {
        x = that.x;
        y = that.y;
        z = that.z;
        return *this;
    }

    inline double& operator [](int index)
    {
        return (&x)[index];
    }
};

...
Vec3 foo = bar;            // operator =
double result = foo[2];    // operator []
...

Both operators are inlined, so the value of result depends on how the compiler finally orders the instructions. Possible cases:

foo.x = bar.x;
foo.y = bar.y;
foo.z = bar.z;
result = (&foo.x)[2];    // correct -- result contains new value

foo.x = bar.x;
foo.y = bar.y;
result = (&foo.x)[2];    // incorrect -- result contains old value
foo.z = bar.z;

foo.x = bar.x;
result = (&foo.x)[2];    // incorrect -- result contains old value
foo.y = bar.y;
foo.z = bar.z;

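Some compilers do not realise that (&foo.x)[2] refers to the same data as foo.z, so they may reorder the instructions as in the last two cases. Indexing past x through &x is undefined behaviour in C++, so the optimizer is allowed to assume the two accesses never alias. Bugs like this are very hard to find.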
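
A possible way to keep `operator[]` on a struct like this without indexing past `x` is a table of pointers to members. The following is a minimal sketch, assuming C++11 or later; it is a swapped-in technique, not something proposed in the answer above, and it drops the hand-written `operator =` because the implicit one suffices.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 {
    double x, y, z;

    double& operator[](std::size_t index) {
        // Each table entry names an actual member, so the compiler can see
        // that index 2 refers to z; no arithmetic is done on &x.
        static constexpr double Vec3::* members[] = { &Vec3::x, &Vec3::y, &Vec3::z };
        assert(index < 3);
        return this->*members[index];
    }
};
```

With this version, `foo[2]` is an ordinary access to `foo.z`, so the reordering problem described above does not arise from the indexing itself.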