#include <windows.h> 
#include <stdio.h>

WCHAR *HiveName[4] = {L"HKCR", L"HKCU", L"HKLM", L"HKU"};

int wmain( INT argc, WCHAR **argv )
{
    for ( DWORD i = 0x80000000; i < 0x80000004; i++ )
        wprintf(L"%lu %s\n", i, HiveName[i]);
    return 0;
}

Output:

2147483648 HKCR
2147483649 HKCU
2147483650 HKLM
2147483651 HKU

Why does it work?

ZarNi Myo Sett Win
  • You're going out of bounds of the array, that leads to [*undefined behavior*](https://en.wikipedia.org/wiki/Undefined_behavior), which really makes all speculation about behavior useless. – Some programmer dude Apr 26 '18 at 02:41
  • Are you compiling as x86 or x64? – Jack C. Apr 26 '18 at 03:01
  • Also, if you really want to know what is happening here, it's useful to look at the [assembly listing](https://stackoverflow.com/questions/1020498/how-to-view-the-assembly-behind-the-code-using-visual-c). – Jack C. Apr 26 '18 at 03:17
  • x86 ... I read that static arrays are limited in size to 0x7FFFFFFF. So wrapping indices there makes some sense. – Vince_Fatica Apr 26 '18 at 03:28
  • From the ASM ... It's multiplying the index by 4 (size of DWORD) to get bytes I imagine. Wrapped at 2^32, that gives 0, 1, 2, 3. It makes sense. – Vince_Fatica Apr 26 '18 at 03:49

1 Answer


First, as Some Programmer Dude says, out-of-bounds array indexing is undefined behavior. This means that according to the ISO C++ standard, the compiler is allowed to emit anything at all. The compiler could even emit a virus that encrypts your hard drive, and it would still be a standards-compliant compiler.

That having been said, I have some speculation about what may be happening.

On Windows, x86 user-space processes can use virtual addresses from 0x00000000 to 0x7fffffff. Addresses from 0x80000000 up are reserved for the kernel by default, although there are ways to raise the user-space limit as high as 3 GB. In any case, it seems the limit on any particular allocation is 2 GB, so there's absolutely no way that an index of 0x80000000 or higher could point to a validly allocated object. The compiler is then free to emit code on the assumption that i must somehow be less than 0x80000000.

In this case there may not be any real "optimization." One version of the MSVC compiler outputs the following for the array indexing operation:

push    DWORD PTR wchar_t ** HiveName[esi*4]

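Here, esi contains the index, i. The addressing mode scales it by 4, which is sizeof(wchar_t*), the size of one array element. On x86 the effective address is computed modulo 2^32, so the top two bits of the index are thrown away: 0x80000000 * 4 wraps to 0, 0x80000001 * 4 wraps to 4, and so on. The load therefore happens to hit HiveName[0] through HiveName[3], which is why the output looks right.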

Jack C.
  • Thanks, Jack C.! Of course, it multiplies by 4 to get to the next pointer, not to the next DWORD as I said in an earlier comment. So it was undefined behavior together with a bit of luck. I was doing arithmetic on the actual values of HKEY_CLASSES_ROOT (0x80000000) and friends ... discovered my own mistake but then wondered why it worked anyway. – Vince_Fatica Apr 26 '18 at 04:53