I am seeing some odd behavior when using VirtualAlloc. I'm working in C++ with Visual Studio 2010.

I have two things I want to allocate, and I'm using VirtualAlloc for both (I have my reasons, irrelevant to the question):

1 - Space to hold a buffer of x86 assembly code
2 - Space to hold the data structure that the x86 code wants

In my code I am doing:

thread_data_t * p_data = (thread_data_t*)VirtualAlloc(NULL, sizeof(thread_data_t), MEM_COMMIT, PAGE_READWRITE);
//set up all the values in the structure
unsigned char* p_function = (unsigned char*)VirtualAlloc(NULL, sizeof(buffer), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(p_function, buffer, sizeof(buffer));
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)p_function, p_data, 0, NULL);

In DEBUG mode: works fine.
In RELEASE mode: the spawned thread receives NULL as its input data. I verified in the debugger that the pointer is correct at the point where I call CreateThread.

If I swap the two VirtualAlloc calls, so that I allocate the function space before the data space, then both DEBUG and RELEASE modes work fine.

Any ideas why? I've verified that all my VS build settings are the same between DEBUG and RELEASE.

Collin Dauphinee
Without Me It Just Aweso

1 Answer

After copying machine code into a memory buffer, you can't just jump straight into that buffer. The instruction cache needs to be flushed first, or the CPU may not see the code you just wrote. You can use FlushInstructionCache to do this.

https://msdn.microsoft.com/en-us/library/windows/desktop/ms679350%28v=vs.85%29.aspx

It's hard to say exactly why reordering the allocations would fix the issue, but if you copied the instructions into their buffer and then did a lot of work before jumping into the buffer, that would likely improve the odds of "getting away with it," as the CPU caches would have more of an opportunity to get flushed out by other means.
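The order of operations described above can be sketched as follows. This is only an illustration under stated assumptions, not the asker's actual code: `thread_data_t` here is a hypothetical one-field stand-in, `run_generated_thread` is a made-up helper name, and `buffer` holds a few hand-assembled bytes that load the first DWORD of the data structure and return it as the thread's exit code. It assumes an x86 or x64 Windows build.

```cpp
// Windows-only sketch; the guard keeps other platforms from tripping over <windows.h>.
#ifdef _WIN32
#include <windows.h>
#include <cstring>

// Hypothetical stand-in for the data structure the generated code expects.
struct thread_data_t { DWORD value; };

// Hand-assembled thread routine: return p_data->value as the exit code.
#if defined(_M_X64) || defined(__x86_64__)
static const unsigned char buffer[] = {
    0x8B, 0x01,              // mov eax, [rcx]   (parameter arrives in rcx)
    0xC3                     // ret
};
#elif defined(_M_IX86) || defined(__i386__)
static const unsigned char buffer[] = {
    0x8B, 0x44, 0x24, 0x04,  // mov eax, [esp+4] (stdcall parameter)
    0x8B, 0x00,              // mov eax, [eax]
    0xC2, 0x04, 0x00         // ret 4
};
#else
#error "This sketch only covers x86 and x64."
#endif

// Allocates data and code, flushes the instruction cache, runs the thread,
// and reports its exit code. Returns false if any step fails.
bool run_generated_thread(DWORD value, DWORD* exit_code) {
    thread_data_t* p_data = static_cast<thread_data_t*>(
        VirtualAlloc(NULL, sizeof(thread_data_t), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE));
    if (!p_data) return false;
    p_data->value = value;

    void* p_function = VirtualAlloc(NULL, sizeof(buffer),
                                    MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (!p_function) {
        VirtualFree(p_data, 0, MEM_RELEASE);
        return false;
    }
    std::memcpy(p_function, buffer, sizeof(buffer));

    // Flush after the copy and before any thread can execute the new bytes.
    FlushInstructionCache(GetCurrentProcess(), p_function, sizeof(buffer));

    bool ok = false;
    HANDLE thread = CreateThread(NULL, 0,
        reinterpret_cast<LPTHREAD_START_ROUTINE>(p_function), p_data, 0, NULL);
    if (thread) {
        WaitForSingleObject(thread, INFINITE);
        ok = GetExitCodeThread(thread, exit_code) != 0;
        CloseHandle(thread);
    }

    VirtualFree(p_function, 0, MEM_RELEASE);
    VirtualFree(p_data, 0, MEM_RELEASE);
    return ok;
}
#endif  // _WIN32
```

Calling `run_generated_thread(42, &code)` and comparing `code` against the stored value is a quick way to check that the data pointer reached the generated routine intact. The key point is only the placement of the `FlushInstructionCache` call between the `memcpy` and `CreateThread`.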

StilesCrisis
  • BTW, I replaced my link with a better reference. The old link was more about detecting the case than solving it. – StilesCrisis Jan 26 '15 at 21:27
  • Could you also put back the other link you had, so I can read about both detecting it and solving it? Thanks! – Without Me It Just Aweso Jan 26 '15 at 21:35
  • Also, in reading the link, I see "FlushInstructionCache is not necessary on x86 or x64 CPU architectures as these have a transparent cache." – Without Me It Just Aweso Jan 26 '15 at 21:36
  • @WithoutMeItJustAweso: That does not appear in the documentation. Other users' comments could quite possibly be wrong. The official statement is "Applications should call FlushInstructionCache if they generate or modify code in memory." – Ben Voigt Jan 26 '15 at 21:37
  • Also, since you are involving multiple CPUs/threads, this may well be considered "Cross Modifying Code" which is an even more difficult situation for CPUs to recover from. – StilesCrisis Jan 26 '15 at 21:45
  • This is only really necessary if you are modifying existing code. If you are writing code to newly allocated memory, the CPU is not going to have a cache for it. – Gerald Jan 26 '15 at 21:50
  • There's no guarantee that `VirtualAlloc` will return a pristine untouched page that has never seen action before. – StilesCrisis Jan 26 '15 at 21:55
  • BTW, the old link I had was http://stackoverflow.com/questions/17395557/observing-stale-instruction-fetching-on-x86-with-self-modifying-code but I honestly think it is a red herring. It is more about observing that the CPU detected a need to flush its own instruction cache. – StilesCrisis Jan 26 '15 at 21:57
  • 1
    I added the call to flushinstructioncache before spawning my thread and it seems to work. Which is good but it confuses me. VirtualAlloc allocates space in the heap which I assumed wouldn't have caching problems. I can understand modifying something in my virtual memory space, that was potentially cached. But the I would think any reference into the heap would get updated since the heap is expected to be modified often? – Without Me It Just Aweso Jan 26 '15 at 22:06
  • Heap space can easily be reused. All it takes is anything in the OS or in your program calling `VirtualFree`. Suddenly the kernel says, "hey, I can reuse this!" Your `VirtualAlloc` happens and it gets reused. That's that. – StilesCrisis Jan 26 '15 at 22:43