Debugger-friendly zero-cost named references to specific locations in an array

Question

struct A
{
    int a   = 1;
    short b = 2;
    char c  = 3;
};

struct B
{
    using arr_type = array<A,3>;
    char asd = 0;
    A a1;  
    A a2;
    A a3;

    // is this safe to use to loop through all 3 elements?
    arr_type* p1 = reinterpret_cast<arr_type*>(&a1);

    // or maybe this one?
    A* p2 = &a1;
};

Can I safely use p1 or p2 to loop from a1 to a3?

B b;

for (int i = 0; i < 3; i++)
{
     cout << (*b.p1)[i];  // p1 points to the whole array, so dereference first
     cout << b.p2[i];
}

The reason why it's not a simple array is because I want each "item" to have a proper name.

I could instead use the union approach, but standard C++ prohibits anonymous structs (although this is not a problem for me, since MSVC supports them and GCC seems to as well):

union E
{
    A arr[3];
    struct {
        A a1;
        A a2;
        A a3;
    };
};

The following is clearly safe, but each reference adds 4 bytes of overhead, which I don't like (plus the cost of initializing the references):

struct B
{
    char asd;
    A arr[3];  
    A& a1 = arr[0];
    A& a2 = arr[1];
    A& a3 = arr[2];
};

And this one has no overhead but for my very specific case, it's not good enough.

struct B
{
    char asd;
    A arr[3];  
    A& a1() { return arr[0]; }
    A& a2() { return arr[1]; }
    A& a3() { return arr[2]; }
};

I'm going to be using those a1, a2, a3 names very often, and they are harder to debug in Visual Studio if they are function calls. Because I'll be using those fields a lot, I want to be able to check their values easily.

To get an actual, useful answer you might want to explain what you are trying to do. — user657267, Apr 07 '16 at 04:48
Why not have `A& a1() { return arr[0]; }` / `const A& a1() const { return arr[0]; }` etc..? Lets you refer to the array entries using your presumably-meaningful "a1", "a2" names, but without the potential storage overhead of references, and it will all be optimised away at even the lower optimisation settings. — Tony Delroy, Apr 07 '16 at 04:51
*"I just dont want to use it"* isn't very useful - if you say why don't you want to use it, that gives some criterion for "better" that might lead to other answers... — Tony Delroy, Apr 07 '16 at 04:57
I think it's safe since it counts as the same "object" in the sense of allocation, and the correct types are always used, and the pointers are in-bound. — o11c, Apr 07 '16 at 05:18
C++ also prohibits union aliasing so don't even start on that one — M.M, Apr 07 '16 at 06:48
If you're asking the question "is this safe" then most likely you shouldn't do it anyway. — gnasher729, Apr 07 '16 at 07:55
I feel revision 8 is not an improvement upon the title, and that it doesn't make sense as an edit as it puts words into the op's mouth. Rollback, yay or nay? — Ultimater, Apr 07 '16 at 18:57

score 7 · Accepted Answer · edited Jun 20 '20 at 09:12

struct B
{
     using arr_type = array<A,3>;
     char asd = 0;
     A a1;  
     A a2;
     A a3;

     // is this safe to use to loop through all 3 elements?
     arr_type* p1 = reinterpret_cast<arr_type*>(&a1);
 };

Structs need to align naturally for their types, and so do arrays, but I don't know of any rule saying they have to be the same alignment points.

If there were such a rule that struct layout boundaries for members like this and array boundaries will be the same--it would only apply to standard layout structs:

https://stackoverflow.com/a/7189821/211160

All bets would be off if you did something like:

private:
    A a1;
    A a2;
public:
    A a3;

I'd imagine all bets would be off if it contained anything that flipped off the standard layout switch. As it was questionable to start with, I'd say don't even do it then.

(I'd also wonder what kind of differences #pragma pack() would throw in for arrays vs. structs...not that #pragmas are in the standard, I just wonder.)
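
The layout worries above can at least be turned into compile-time checks, along the lines of the static_assert idea raised in the comments. This is only a sketch: `LayoutChecked` and `named_span` are hypothetical names, and `A` is copied from the question. Passing asserts show that the three members sit back to back on this compiler with these settings; as far as I can tell they do not make indexing past `&a1` well-defined by the standard.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

struct A
{
    int a   = 1;
    short b = 2;
    char c  = 3;
};

// Hypothetical stand-in for the question's B.
struct LayoutChecked
{
    char asd = 0;
    A a1;
    A a2;
    A a3;
};

// offsetof is only unconditionally supported for standard-layout types.
static_assert(std::is_standard_layout<LayoutChecked>::value,
              "LayoutChecked must be standard layout");

// Each named member must start exactly sizeof(A) bytes after the previous one.
static_assert(offsetof(LayoutChecked, a2) - offsetof(LayoutChecked, a1) == sizeof(A),
              "unexpected padding between a1 and a2");
static_assert(offsetof(LayoutChecked, a3) - offsetof(LayoutChecked, a2) == sizeof(A),
              "unexpected padding between a2 and a3");

// Number of A-sized slots covered by a1..a3 (hypothetical helper).
constexpr std::size_t named_span()
{
    return (offsetof(LayoutChecked, a3) + sizeof(A)
            - offsetof(LayoutChecked, a1)) / sizeof(A);
}
```

If a member is later inserted between a1 and a3, or packing settings change, the build fails instead of the loop silently reading the wrong bytes.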

union E
{
    A arr[3];
    struct {
        A a1;
        A a2;
        A a3;
    };
};

No, arr[N] and aN would not be equivalent. There are some subtle details about how you can use initial sequences in unions for compatible reading in C++...but that's only between structures with compatible sequences. It says nothing about a struct and an array:

See the question "Type punning a struct in C and C++ via a union".

"I'm gonna be using those a1, a2, a3 names very often, and it's harder to debug them if they are function calls in Visual Studio. And again, I'm going to be using those fields a lot, so I want to be able to check their values easily."

"And the following is clearly safe, but it has a 4 byte overhead for each reference"

In practice it appears you are correct: today's GCC doesn't optimize the references out (per your links):

https://godbolt.org/g/6jAtD5

http://ideone.com/zZqfor

That's disappointing, because they could be optimized out: nothing in the standard says they have to take up space. They point into the structure itself, and they don't change for the lifetime of the structure. :-/
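
To measure the overhead on your own compiler, a sketch like the following can help. `Plain`, `WithRefs` and `reference_overhead` are hypothetical names, and `A` is copied from the question. It assumes, as the linked experiments suggest, that the compiler stores each reference member as a pointer; the standard does not require that.

```cpp
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_copy_assignable

struct A
{
    int a   = 1;
    short b = 2;
    char c  = 3;
};

// Hypothetical: the array on its own.
struct Plain
{
    char asd = 0;
    A arr[3];
};

// Hypothetical: the same data plus named reference members.
struct WithRefs
{
    char asd = 0;
    A arr[3];
    A& a1 = arr[0];
    A& a2 = arr[1];
    A& a3 = arr[2];
};

// Extra bytes per object attributable to the reference members
// (including any padding they force).
constexpr std::size_t reference_overhead()
{
    return sizeof(WithRefs) - sizeof(Plain);
}

// A side effect beyond size: reference members delete the implicit
// copy assignment operator.
static_assert(!std::is_copy_assignable<WithRefs>::value,
              "reference members make the type non-assignable");
```

Note also that the implicit copy constructor copies the references themselves, so a copied `WithRefs` would have its a1..a3 bound to the source object's array rather than its own.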

Your complaint against the function access which would be optimized away was that it wasn't debugger-friendly enough. Why not do both?

struct B
{
    char asd;
    A arr[3];

    A& a1() { return arr[0]; }
    const A& a1() const { return arr[0]; }
    A& a2() { return arr[1]; }
    const A& a2() const { return arr[1]; }
    A& a3() { return arr[2]; }
    const A& a3() const { return arr[2]; }

#if !defined(NDEBUG)
    A& a1_debug = arr[0];
    A& a2_debug = arr[1];
    A& a3_debug = arr[2];
#endif
};

If a debugger-friendly view of your data structures is important to you, it might be a good use of time to learn how to write custom debugger helpers for your environment, e.g.:

http://doc.qt.io/qtcreator/creator-debugging-helpers.html

I guess whether that's worth it depends on how often you have this kind of concern.

answered Apr 07 '16 at 05:20

HostileFork says dont trust SE

@Phantom You edited your question [a lot of times](http://stackoverflow.com/posts/36466876/revisions) while I was answering, hard to keep track. – HostileFork says dont trust SE Apr 07 '16 at 05:23
Yes, I did, mainly because user657267 hinted it wasn't good enough. I'm going to use a1, a2, a3 often, and keep using them in release builds. The names are obviously not a1, a2 , a3 tho, and I have more than 3 variables as well. – Gam Apr 07 '16 at 05:29
Could you please give more detail on your claim that compilers can optimize a struct (and remove a field if it's not used)? That doesn't make any sense to me. – Gam Apr 07 '16 at 05:45
As far as I'm aware, structs do guarantee that the order in memory is the same as declared, and fields would not be optimized away. Having said that, you're not guaranteed contiguous memory for fields either, because of alignment. I don't have time to find any relevant information in the standard right now to verify what I just said, so I'll just leave this here as a comment. – Bart van Nierop Apr 07 '16 at 05:53
@BartvanNierop If a class type has an int and a char (5 bytes) and it's not aligned, then the next member of the same type should take 5 bytes as well. If it's aligned and takes 6 bytes, the next one should take 6 bytes too. It doesn't make sense for a compiler to align a member of type X and then not align the next member of the same type. – Gam Apr 07 '16 at 06:14
@Phantom Added info suggesting you likely need not worry about the reference cost. Regarding reference members being optimized out--there is the "proof" of the assembly, and the C++ FAQ: *Unlike a pointer, once a reference is bound to an object, it can not be "reseated" to another object. The reference itself isn't an object (it has no identity; taking the address of a reference gives you the address of the referent; remember: the reference is its referent).* I believe data members could theoretically be optimized out as far as the standard reaches, but you'd probably not like that compiler. – HostileFork says dont trust SE Apr 07 '16 at 06:33
@HostileFork A reference will always increase the object size by 4 (if running in a 32 bit machine). This is why I always see them as pointers. Try it yourself. https://godbolt.org/g/lHQiy6 – Gam Apr 07 '16 at 06:36
@Phantom Um. I did try it myself. See above. But that was with GCC with those optimization settings. This is not something that the standard specifies--beyond saying that references themselves aren't objects and have no addresses--hence the optimization is possible, so it depends. Your compiler and optimization settings may vary. – HostileFork says dont trust SE Apr 07 '16 at 06:38
@HostileFork That's because you are using an array of 0 elements. Try using one of 3 elements (like I need), and then create a reference to one of them. The size will be sizeof(void*) bytes bigger, even with optimizations on. I've tried it myself. https://godbolt.org/g/6jAtD5 or ideone (C++14 as well) http://ideone.com/zZqfor and MSVC 2015 update 2 gives the same output as well. – Gam Apr 07 '16 at 06:44
You could cut out all the objections regarding alignment, padding, "optimizing out a2", etc. by checking that `sizeof` the struct is what you expect before proceeding – M.M Apr 07 '16 at 06:51
@M.M I was considering that. But my B class has more members, and I would have to update it every time a change is made. Do you think this code should be enough? `static_assert((offsetof(B, a3) - offsetof(B, a1)) == (sizeof(A) * (3-1)), "Error");` (or maybe, instead of using a3, I could use the next member after a3 and just `(sizeof(A) * 3)`) – Gam Apr 07 '16 at 07:23
@Phantom Sorry for the typo... I meant arr[1]. Well that's interesting. Added another suggestion, maybe you could have the references in the debug build under another label, and use inline functions for your in-program accesses? – HostileFork says dont trust SE Apr 07 '16 at 07:32
@HostileFork I think I will use the static assert approach. http://ideone.com/a41tym Seems safe. What do you think? The reason I want to use std::array as well is that I can pass it to functions and inherit the "size" parameter, use it in range-based for loops, and so on. – Gam Apr 07 '16 at 07:44
@Phantom std::array is good to use anywhere you can. As for if a static_assert is enough, I guess that's your call. Whether it worked or not...I'd probably look at such code skeptically and take it as a warning sign that the person doing it was being "unnecessarily tricky"...when setters and getters would be more conventional, and also provide places to add instrumentation or bridge compatibility in the face of change (usual C++ stuff). What if your array changes where a formerly stored property can be computed? etc. – HostileFork says dont trust SE Apr 07 '16 at 08:19

score 2 · Answer 2 · answered Apr 07 '16 at 07:39

No need for such nastiness.

std::tuple coupled with a lambda gives you all the functionality you want. Plus it's perfectly legal, optimal and correct.

If we define a member function that returns a tuple of references to all As in the structure:

  auto all_a() const {
    return std::tie(a1, a2, a3);
  }

...then create a little plumbing to provide a means to walk over the tuple (see below)...

... we can write code like this:

  B b;
  for_each(b.all_a(), 
           [](const A& a) { std::cout << a << std::endl; });

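Full example (the operator<< here is only a minimal placeholder so that the example links; write your own as needed).

#include<iostream>
#include<array>
#include<tuple>
#include<utility>

using namespace std;

struct A
{
    int a   = 1;
    short b = 2;
    char c  = 3;
};
std::ostream& operator<<(std::ostream& os, const A& a)
{
    // c is printed as a number rather than as a raw character
    return os << a.a << ' ' << a.b << ' ' << int(a.c);
}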

struct B
{
    char asd = 0;
    A a1;  
    A a2;
    A a3;

  auto all_a() const {
    return std::tie(a1, a2, a3);
  }

};


template<class Tuple, size_t...Is, class F>
  void for_each_impl(const Tuple& t, std::index_sequence<Is...>, F&& f)
{
  using expand = int[];
  void(expand { 0, (void(f(std::get<Is>(t))),0)... });
}

template<class...Ts, class F>
void for_each(const std::tuple<Ts...>& ts, F&& f)
{
  // dispatch with one index per tuple element
  for_each_impl(ts, 
                std::make_index_sequence<sizeof...(Ts)>(), 
                std::forward<F>(f));
}

int main()
{
  B b;
  for_each(b.all_a(), 
           [](const A& a) { std::cout << a << std::endl; });

}
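
If C++17 is available, the plumbing can be shortened with `std::apply` and a fold expression. This is a sketch rather than part of the original answer: `for_each_element` and `sum_of_a` are hypothetical names, chosen to avoid clashing with the `for_each` above.

```cpp
#include <tuple>    // std::tie, std::apply (C++17)
#include <utility>  // std::forward

struct A
{
    int a   = 1;
    short b = 2;
    char c  = 3;
};

struct B
{
    char asd = 0;
    A a1;
    A a2;
    A a3;

    // Tuple of references to the named members, in declaration order.
    auto all_a() const {
        return std::tie(a1, a2, a3);
    }
};

// Hypothetical helper: calls f on each tuple element, left to right.
template<class Tuple, class F>
void for_each_element(Tuple&& t, F&& f)
{
    std::apply([&f](auto&&... elems) { (f(elems), ...); },
               std::forward<Tuple>(t));
}

// Example use: add up the `a` field of every named member.
inline int sum_of_a(const B& b)
{
    int total = 0;
    for_each_element(b.all_a(), [&total](const A& a) { total += a.a; });
    return total;
}
```

The comma fold guarantees left-to-right evaluation, so the members are visited in the order they were passed to std::tie.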

Presumably the goal is a "guarantee" of compatibility with clients who expect arrays... and as an abstraction, [std::tuple would make even fewer promises](http://stackoverflow.com/questions/14597006/stdtuple-memory-alignment) of that... *"Not only is there no requirement that the objects be arranged any particular way, but many tuple implementations actually put the second object before the first one."* — HostileFork says dont trust SE, Apr 07 '16 at 07:54
@HostileFork the presumption of compatibility was not stated in the question, so I can't address that. The actual physical order of items in the tuple does not matter. What matters is the order of access. This is guaranteed since we're accessing through a sequence of calls to std::get with increasing I. — Richard Hodges, Apr 07 '16 at 08:06
@HostileFork for the record, if compatibility with clients that need an array was the goal, then I completely concur with your solution of storing the data in an array and providing proxy accessors. — Richard Hodges, Apr 07 '16 at 08:09
Good point it's not clear, title is ambiguous. I made it less so in favor of what I *think* the interpretation is...OP can correct if it's wrong. :-) — HostileFork says dont trust SE, Apr 07 '16 at 08:29
