0

I was reading the source of a hashing competition today, and came across this:

#define BYTES_IN_BLOCK 1024

struct block{
    uint8_t v[BYTES_IN_BLOCK];

    block(){ memset(v, 0, BYTES_IN_BLOCK); }
    uint64_t& operator[](uint8_t i){ return *(uint64_t*)(v + 8 * i); }
};

Then, a little later in the code, there's this:

state = new block[cost]; // cost is a uint32_t (like 1024)
// Probably 50 lines of code.
block prev_block;

prev_block = state[foo]; // foo is a uint32_t

What I can't figure out is what this is doing. Now, I understand C, but C++ not so much. Bear with me here for a second.

This part: return *(uint64_t*)(v+8*i) should return a uint64_t, and does so when I tested it:

state->v[8*10] = 12;
uint64_t bar = *(uint64_t*)(v+8*10);
printf("%" PRIu64 "\n", bar);

So that all makes sense.

But this:

prev_block = state[foo];

Makes no sense. Since state is block*, prev_block should now "be" state, correct? But it doesn't, because their arrays are different.

state->v[8*12] = 12;
printf("%" PRIu64 "\n", (*state)[12]);
prev_block = state[12];
printf("%" PRIu64 "\n", (*(&prev_block))[12]);

So, what exactly is going on here?

Eric Lagergren
  • 501
  • 2
  • 7
  • 18
  • 2
    state is "block*" but prev_block is correctly "block" since it is defined as "state[foo]" which is " *(state + foo) ". – Pouya Tehrani Jul 21 '15 at 05:26
  • Ugh. This code is waay too clever. `struct block` should just be `std::array< std::uint64_t, BYTES_IN_BLOCK / sizeof (std::uint64_t) >`. The way it's written, `block prev_block` might be misaligned in memory. – Potatoswatter Jul 21 '15 at 06:20
  • Not directly related to your question, but this entire concept violates the [strict aliasing rule](http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule). (so all of the code has undefined behaviour) – M.M Jul 21 '15 at 06:34
  • @Potatoswatter `std::aligned_storage` is probably more appropriate – M.M Jul 21 '15 at 06:37
  • For the record, this isn't my code. It's from a recent hash algorithm contest. This is (scarily) part of the winning code. @MattMcNabb Thanks for that link -- it was a good read and a good reminder. – Eric Lagergren Jul 21 '15 at 06:45
  • @Potatoswatter You should see the rest of their code >:) – Eric Lagergren Jul 21 '15 at 06:45

3 Answers3

1
state = new block[cost];
prev_block = state[foo];

is analogous to:

int* arr = new int[size];
int a = arr[index];

That's basic C++. I am not sure why that is confusing.

R Sahu
  • 204,454
  • 14
  • 159
  • 270
  • Excuse me, but I don't recall mentioning I had a grasp on C++. That last bit was unwarranted. – Eric Lagergren Jul 21 '15 at 05:48
  • @eric_lagergren, I didn't mean anything offensive. I was just puzzled. – R Sahu Jul 21 '15 at 06:01
  • It's alright. If you're wondering, it was a dead simple question. The operator overloading happened to confuse me and cause me to overlook the simple array index bit. Thanks for your answer though! – Eric Lagergren Jul 21 '15 at 06:04
1

You are mixing up the two operator[]s involved here. In your last example, you set state[0][12] = 12, and you're comparing it to state[12][12]. Since state is a block*, state[n] is just normal array access; it doesn't invoke the operator[] defined in block.

Miles Budnek
  • 28,216
  • 2
  • 35
  • 52
  • And now after that explanation I feel ridiculously stupid for not catching that. Definitely not one of my brightest moments. Thanks, Miles. – Eric Lagergren Jul 21 '15 at 05:50
  • @eric_lagergren this is your answer, but there are a few other things going on in that code and your print test at the bottom you need to know about before they bite you. – user4581301 Jul 21 '15 at 06:15
1

There is confusion with a number of concepts here. I'm going to blast through all the ones I see because they are all important, not just the immediate answer.

state is a pointer to block, but state[0] should just be a block, specifically the first block in state and also the result of *state.

prev_block = state[foo];

All of the data in block is simple, just a self-contained array of bytes, so it should be directly copy-able without any special assistance. prev_block = state[foo] should copy state[foo] to prev_block. Since it's a copy, the addressing will be different.

In the printout code provided:

state->v[8*12] = 12;

Breaking his down for clarity. state-> is going to access the first element of the state array. state->v[8*12] is going to access v[8*12] of state[0]. state->v[8*12] = 12; is going to set v[8*12] of state[0]to 12. This means byte 96 of v is going to be 12. To reference a different state you can use (state + array_index)->v[8*12]; or state[array_index].v[8*12]; I find the latter more readable.

printf("%" PRIu64 "\n", (*state)[12]);

(*state) gives you the first state in the array, AKA state[0]. (*state)[12] uses state[0]'s [] operator, defined as uint64_t& operator[](uint8_t i){ return *(uint64_t*)(v + 8 * i); }

This is going to return a 64 bit int starting at the address of state[0].v[12*8] and comprised of the next 8 bytes of array v (v[96] through v[103], resulting in 12,0,0,0,0,0,0,0). This will be 12 or a god-awful big number depending on the system's endian. The wrapping printf is going to print the returned number.

prev_block = state[12];

Is going to copy the 13th element of the state array to prev_block, assuming enough blocks were created by state = new block[cost];. Nothing magical, but there shouldn't be anything there but zeros because the only state that has any values set is state[0]. You either wanted to copy state[0] here or write to state[12] up a few lines.

printf("%" PRIu64 "\n", (*(&prev_block))[12]);

the * and & cancel each other out before accomplishing anything. It will then print out the result of using the block [] operator as above. Should be zero.

user4581301
  • 33,082
  • 7
  • 33
  • 54
  • Wow. Thanks for this write up. Most of this I already understand, e.g., array indexes, pointers. The bit at the end was to shush the compiler from complaining about using an operator without a pointer. Anyway, thanks again. After Miles' answer I noticed I was making a stupid mistake that I _should_ have caught but didn't because the operator overloading was tripping me up. Great answer though. – Eric Lagergren Jul 21 '15 at 06:31
  • And for the record, other than my little test snippets, this isn't my C++. It's from a (Google's?) recent hash algorithm contest. This is part of the winning code. – Eric Lagergren Jul 21 '15 at 06:35
  • @eric_lagergren You might want make that point in your question. The code is already being dissected unfavourably. – user4581301 Jul 21 '15 at 06:39
  • Ehhh, I kind of like seeing the code get picked apart... I don't write C++ (only some C when needed) so it's been an interesting SO thread :-) – Eric Lagergren Jul 21 '15 at 06:41