0

I currently have this code:

struct LAYOUT {
            WORD a [8] = {::Rook, ::Knight, ::Bishop, ::Queen, ::King, ::Bishop, ::Knight, ::Rook};
            WORD b [8] = {::Pawn};
            WORD empty [32] = {::None};
            WORD c [8] = {::Pawn+0x6};
            WORD d [8] = {::Rook+0x6, ::Knight+0x6, ::Bishop+0x6, ::Queen+0x6, ::King+0x6, ::Bishop+0x6, ::Knight+0x6, ::Rook+0x6};
        }chessLayout;
LAYOUT* chessboard = &chessLayout;

The global enum fields such as ::Rook represent a word, ex: 0x2656.

My objective here is to enumerate all the elements in this structure, so I wrote this piece of code:

for(int i = 0; sizeof(LAYOUT) / 2;i++){
     printf("%x", *(reinterpret_cast<WORD*>(chessboard+i)));
}

However this returns me the first value right but then returns unrelated junk values and not the elements in the struct.

Any ideas?

Thanks in advance

Alex
  • 840
  • 7
  • 23
  • 1
    In `chessboard+i`, your chessboard pointer is still a `LAYOUT*`, right? You are incrementing the wrong size. I'd give `(reinterpret_cast(chessboard)+i)` a try. – Cedric Dec 08 '20 at 10:37
  • Even if you fix that, you will have UB once you access `b[0]` that way. as pointer arithmetic has specific rules. – Jarod42 Dec 08 '20 at 10:54
  • @Jarod42 I know it is UB to iterate outside of an array, but is that true even if the pointer is inside a valid array (even if it's not the "original" one)? – Cedric Dec 08 '20 at 10:58
  • 1
    Pointer arithmetic can only be done inside the array (+ end). we extend for that rule that regular variables are array of size one. so `a[8]` is UB, even if `&a[8] == &b[0]`. – Jarod42 Dec 08 '20 at 11:05
  • @Jarod42 Thanks. I remember the rule, but I am struggling to understand the difference between e.g. `*reinterpret_cast(chessboard)+0` and `*reinterpret_cast(chessboard)+8` (i.e. using the chessboard pointer that happens to fall on a, and one that happens falls on b) – Cedric Dec 08 '20 at 11:09
  • 1
    @Cedric: First member has special treatment. (I'm not even sure that accessing `a[1]` that way is legal, as the cast is to `WORD*` and not `WORD(*)[8]`) – Jarod42 Dec 08 '20 at 11:13
  • Revise your data structure to be iterable. Your definition of LAYOUT will bring issues. – YSC Dec 08 '20 at 11:17

2 Answers2

1

In the statement

LAYOUT* chessboard = &chessLayout;

/* ... */

for(int i = 0; sizeof(LAYOUT) / 2;i++){
     printf("%x", *(reinterpret_cast<WORD*>(chessboard+i)));
}

the pointer chessboard is incremented, which is of type LAYOUT*. This will move the pointer along "i * sizeof(LAYOUT)" bytes, whereas what you want is to move to the next WORD.

The expression should thus look like this (i outside the cast):

for(int i = 0; i < sizeof(LAYOUT) / 2; ++i){
     printf("%x", *(reinterpret_cast<WORD*>(chessboard)+i));
}

EDIT : As @Jarod42 pointed out, you might quickly run into UB, as we are iterating past the "end iterator" of the array a.

The exact interpretation is sometimes unclear and has cause numerous discussions on SO and in other places.

Here is a thread about this and using the C++17 offsetof functionality which should help in this context: Do we need to use std::launder when doing pointer arithmetic within a standard-layout object (e.g., with offsetof)?

I believe the detailed discussion of this is important, but has enough active threads as not to repeat it here.

Cedric
  • 278
  • 1
  • 9
  • 1
    Also: the check in the for loop always results in true. Should be `i < sizeof(LAYOUT)/2`. – Xoozee Dec 08 '20 at 11:00
1

To avoid exciting tango with UBs, I'd do this in opposite way, let layout contain an array of whole data, while a,b,c,d will be getters. Even casting pointers like in code below would be edge walking, but then we can iterate through data , not worried that it might be interrupted by padding words.

using WORD = unsigned short;

namespace LayoutNS { 
      
struct Layout {
    WORD data[64];
    
    template <size_t N>
    using line = WORD [N];
    
    template <size_t N>
    line<N>* operator[] ( line<N>* (*f)(Layout&)  )
    {
        return (*f)(*this);
    }   
};

}

using LAYOUT = LayoutNS::Layout;

// better to avoid such cast either, can we go just with a pointer? Or return a copy?
LAYOUT::line<8>* line_a (LAYOUT& l) { return (LAYOUT::line<8>*)(l.data); }
LAYOUT::line<8>* line_b (LAYOUT& l) { return (LAYOUT::line<8>*)(l.data+8); }
LAYOUT::line<32>* empty (LAYOUT& l) { return (LAYOUT::line<32>*)(l.data+16); }
LAYOUT::line<8>* line_c (LAYOUT& l) { return (LAYOUT::line<8>*)(l.data+48); }
LAYOUT::line<8>* line_d (LAYOUT& l) { return (LAYOUT::line<8>*)(l.data+56); }

int main()
{
    LAYOUT test = {};
    auto test_a = test[line_a];

    std::cout << (*test_a)[1];
}

A simplified version that avoids an indirection level and C++11 features:

struct LAYOUT {
    WORD data[64];
       
    WORD* operator[] ( WORD* (*f)(LAYOUT&)  )
    {
        return (*f)(*this);
    }   
};

WORD* line_a (LAYOUT& l) { return (l.data); }
WORD* line_b (LAYOUT& l) { return (l.data+8); }
WORD* empty (LAYOUT& l) { return (l.data+16); }
WORD* line_c (LAYOUT& l) { return (l.data+48); }
WORD* line_d (LAYOUT& l) { return(l.data+56); }

int main()
{
    LAYOUT test = {{1,3,4,5}};
    auto test_a = test[line_a];

    std::cout << test_a[1];
}
Swift - Friday Pie
  • 12,777
  • 2
  • 19
  • 42
  • 1
    That's a really smart way of thinking about it! However I'm going to keep it the way I did it since the struct is inside another complicated data structure, I also want to keep it relatively simple. But a big thanks anyway. – Alex Dec 08 '20 at 11:31
  • 1
    @Alex fair point, if that external code was relying on implementation heavily, I wrote it intentionally that way it may be used with minimal changes, `chessLayout.a` would become `chessLayout[line_a]` or something similar and nested class wouldn't need any extra namespaces to hide in, but it's up to you anyway. – Swift - Friday Pie Dec 08 '20 at 11:37
  • Yes I agree, the subscript operator makes it way more clearer. However could you explain what this code does `template using line = WORD [N];` and why you use it like `line*`. Thanks in advance – Alex Dec 08 '20 at 11:39
  • 1
    @Alex first is a template of type `line`, a WORD array of size N (`using` in C++11 and later acts as typedef and can be used with templates).. `f` is a function pointer passed to as a subscript that takes Layout& as argument and returns pointer to `line` , in example it get assigned `line_a`. Subscript operator returns same type as function passed to it. Whole thing likely would be inlined. – Swift - Friday Pie Dec 08 '20 at 11:42
  • 1
    So here, `line*` is a pointer to array of `N` words? I'm sorry but I'm still struggling understanding what you are doing in the `line_a` functions `(LAYOUT::line<8>*)(l.data);`, are you casting the data to be a WORD array of length 8? – Alex Dec 08 '20 at 11:48
  • 1
    @Alex yes, I casting pointer to WORD into a pointer to WORD[N]. It might be avoided at all if you do not need that static data for something (e.g. to determine size of array somewhere else statically, to iterate through it). If not required, may go with WORD* instead, code would look simpler, no cast required – Swift - Friday Pie Dec 08 '20 at 11:51
  • 1
    @Alex I added a simpler and more adoptable version. – Swift - Friday Pie Dec 08 '20 at 12:20