9

I was working on a shared pointer (called Handle) implementation for my student project's game engine, and we ran into a bug that we couldn't explain. For some reason, at a certain point in our factory, there was an invalid internal pointer being passed to a factory through a handle, which was causing a crash during our archetype loading. We debugged through the process for hours, and broke any complex statements down to their most simple versions for easier debugging. I finally isolated the problem down to the Handle class' copy constructor. However, it still seemed like there was some intermediate step where the internal pointer was being freed. I read every article I could find on what could cause this problem, and didn't find anything. Finally, I decided to look at the disassembly and see if I could figure out what was happening. Here's the copy constructor without the disassembly.

template <class TYPE>
Handle<TYPE>::Handle(const Handle<TYPE>& rhs, bool weak) : m_weak(weak), m_ref_count(rhs.m_ref_count), m_ptr(rhs.m_ptr)
{
  if(!m_weak)
    ++(*m_ref_count);
}

This is the copy constructor with the disassembly.

template <class TYPE>
Handle<TYPE>::Handle(const Handle<TYPE>& rhs, bool weak) : m_weak(weak), m_ref_count(rhs.m_ref_count), m_ptr(rhs.m_ptr)
{
013FFF50 push         ebp
013FFF51 mov          ebp,esp
013FFF53 sub          esp,0CCh
013FFF59 push         ebx
013FFF5A push         esi
013FFF5B  push        edi  
013FFF5C  push        ecx  
013FFF5D  lea         edi,[ebp-0CCh]  
013FFF63  mov         ecx,33h  
013FFF68  mov         eax,0CCCCCCCCh  
013FFF6D  rep stos    dword ptr es:[edi]  
013FFF6F  pop         ecx  
013FFF70  mov         dword ptr [this],ecx  
013FFF73  mov         eax,dword ptr [this]  
013FFF76  mov         cl,byte ptr [weak]  
013FFF79  mov         byte ptr [eax],cl  
013FFF7B  mov         eax,dword ptr [this]  
013FFF7E  mov         ecx,dword ptr [rhs]  
013FFF81  mov         edx,dword ptr [ecx+4]  
013FFF84  mov         dword ptr [eax+4],edx  
013FFF87  mov         eax,dword ptr [this]  
013FFF8A  mov         ecx,dword ptr [rhs]  
013FFF8D  mov         edx,dword ptr [ecx+8]  
013FFF90  mov         dword ptr [eax+8],edx  
  if(!m_weak)
013FFF93  mov         eax,dword ptr [this]  
013FFF96  movzx       ecx,byte ptr [eax]  
013FFF99  test        ecx,ecx  
013FFF9B  jne         Handle<Component>::Handle<Component>+60h (013FFFB0h)  
    ++(*m_ref_count);
013FFF9D  mov         eax,dword ptr [this]  
013FFFA0  mov         ecx,dword ptr [eax+4]  
013FFFA3  mov         edx,dword ptr [ecx]  
013FFFA5  add         edx,1  
013FFFA8  mov         eax,dword ptr [this]  
013FFFAB  mov         ecx,dword ptr [eax+4]  
013FFFAE  mov         dword ptr [ecx],edx  
}

The problem (as far as I can tell) is here:

013FFF6D  rep stos    dword ptr es:[edi]

As everyone knows, this is used to clear memory quickly. However, this instruction is clearing the pointer that the rhs Handle's internal pointer-to-pointer points to, and not only that, but it's violating const, as the passed in Handle is a const reference. I'm not sure if it's a VS issue, some kind of flag that's set in our build settings that's causing it to happen, but the copy constructor works fine until we try to make archetypes with it.

Any help would be greatly appreciated!

EDIT 1: I have further researched the issue after some of the comments brought to light that I might be returning a temporary variable somewhere in my code. I wasn't able to find anything that would suggest I was doing so, but I did sift through some more disassembly to see if I could find any hints. Here is our CreateGenericArchetype function, which is where the problem is cropping up.

Handle<Object> ObjectFactory::CreateGenericArchetype(const std::wstring &ArchetypeName)
{
  /* Create The Object */
  auto archetype = mComponentFactoryMap.find(std::wstring(L"Object"));
  Handle<Component> created_component = archetype->second->CreateArchetypeComponent();
  Handle<Object> retVal = static_handle_cast<Object>(created_component);
  retVal->mArchetypeName = ArchetypeName;
  mArchetypeMap.insert(std::pair<std::wstring, Handle<Object>>(ArchetypeName, retVal));

  return Handle<Object>(retVal, true);
}

The line that's causing all of our problems (at the moment) is this:

Handle<Component> created_component = archetype->second->CreateArchetypeComponent();

Naturally, I extended my debugging to CreateArchetypeComponent as well, so here is that too:

Handle<Component> ComponentTypeFactory<Object>::CreateArchetypeComponent() const
{
  Object* created_object = new Object;
  Component* casted_object = static_cast<Component*>(created_object);
  Handle<Component> newComponent(casted_object);
  newComponent->mType = mType;
  newComponent->mID = GetNewID();
  return newComponent;
}

Object inherits from Component, so the static_cast downcast of the Object to Component is valid, and the constructor successfully constructs the Handle from the Component*. The issue crops up when we try to return the Handle we constructed inside the function. Here is the disassembly of that interaction:

return newComponent;
005602AD  push        0  
005602AF  lea         eax,[newComponent]  
005602B2  push        eax  
005602B3  mov         ecx,dword ptr [ebp+8]  
005602B6  call        Handle<Component>::Handle<Component> (04EA050h)  
005602BB  mov         ecx,dword ptr [ebp-110h]  
005602C1  or          ecx,1  
005602C4  mov         dword ptr [ebp-110h],ecx  
005602CA  mov         byte ptr [ebp-4],0  
005602CE  lea         ecx,[newComponent]  
005602D1  call        Handle<Component>::~Handle<Component> (04F2E7Bh)  
005602D6  mov         eax,dword ptr [ebp+8]  
}
00EB02D9  push        edx  
00EB02DA  mov         ecx,ebp  
00EB02DC  push        eax  
00EB02DD  lea         edx,ds:[0EB0318h]  
00EB02E3  call        @_RTC_CheckStackVars@8 (0E4BA9Eh)  
00EB02E8  pop         eax  
00EB02E9  pop         edx  
00EB02EA  mov         ecx,dword ptr [ebp-0Ch]  
00EB02ED  mov         dword ptr fs:[0],ecx  
00EB02F4  pop         ecx  
00EB02F5  pop         edi  
00EB02F6  pop         esi  
00EB02F7  pop         ebx  
00EB02F8  mov         ecx,dword ptr [ebp-10h]  
00EB02FB  xor         ecx,ebp  
00EB02FD  call        @__security_check_cookie@4 (0E4E9BAh)  
00EB0302  add         esp,130h  
00EB0308  cmp         ebp,esp  
00EB030A  call        __RTC_CheckEsp (0E41C3Dh)  
00EB030F  mov         esp,ebp  
00EB0311  pop         ebp  
00EB0312  ret         4  

*******back to CreateGenericArchetype*******

005C2639  call        __RTC_CheckEsp (04F1C3Dh)  
005C263E  mov         dword ptr [ebp-1D4h],eax  
005C2644  mov         ecx,dword ptr [ebp-1D4h]  
005C264A  mov         dword ptr [ebp-1D8h],ecx  
005C2650  mov         byte ptr [ebp-4],4  
005C2654  mov         edx,dword ptr [ebp-1D8h]  
005C265A  push        edx  
005C265B  lea         ecx,[created_component]  
005C265E  call        Handle<Component>::Handle<Component> (04EA050h)

*******copy constructor disassembly from above here*******

005C2663  mov         byte ptr [ebp-4],6  
005C2667  lea         ecx,[ebp-174h]  
005C266D  call        Handle<Component>::~Handle<Component> (04F2E7Bh)  
  Handle<Object> retVal = static_handle_cast<Object>(created_component);

Unless my copy constructor being called with the "created_component" as the rhs Handle is considered returning a local variable, I can't see where it could possibly be going wrong, unless it's because the returned Handle from CreateArchetypeComponent is on the stack when it's passed to the copy constructor and is then being cleared.

  • 3
    Looks like the `rep` instruction is just clearing out the stack and isn't touching either object. – Captain Obvlious Jun 01 '15 at 21:55
  • 2
    Upvoted for that rare object: a decent, well-researched question! –  Jun 01 '15 at 21:57
  • 1
    BTW, deleting a `const` pointer is not violating `const`. You can very well do `const int* const p = new int[42]; delete[] p;` And if you have a pointer to pointer in a `const` instance, then the first pointer is `const`, but not the data it points to (in this case the second pointer), so you can "clear" the second pointer. – vsoftco Jun 01 '15 at 21:58
  • 3
    I concur that it seems like it is just clearing your locals with 0xCC. This is standard function entrypoint code, and if it did not work correctly, absolutely nothing would work. So, it is highly unlikely that the problem is in that clearing code. Besides, the clearing code is there only because you have told the compiler to put it there via a compiler option. Try removing this option, and you should get the same behavior. (Though, if you don't get the same behavior, it may still not be the clearing code which is at fault; you may be accessing an uninitialized variable somewhere.) – Mike Nakis Jun 01 '15 at 22:11
  • 2
    You are compiling your code with either [/GZ](https://msdn.microsoft.com/en-us/library/hddybs7t.aspx) or [/RTCs](https://msdn.microsoft.com/en-us/library/8wtf2dfz.aspx) compiler option. Your `rep stos` opcode initializes uninitialized stack space (from ebp-0xCC to ebp) with a known pattern (0xCC). It doesn't touch any variables passed to the constructor. Those are passed above ebp. – IInspectable Jun 01 '15 at 22:21
  • @IInspectable You were completely correct, we were compiling with /RTC1 (/RTCsu) and when I changed it to /RTCu it fixed that issue. Thanks! – Blake A. L. Richardson Jun 01 '15 at 22:41
  • 2
    As other comments have mentioned, the `rep stos` instruction is to clear a scratch/locals area of the stack (though it doesn't look to me like that area is used later in the function, but maybe that 's the nature of non-optimized debug builds). If the object you're passing in lives in that area of the stack then it would appear that some other function is returning a pointer or reference to a local variable. If you find that's the case, then you might be able to fix things easily by having that function return an object copy instead. – Michael Burr Jun 01 '15 at 22:42
  • 2
    @BlakeA.L.Richardson: changing `/RTC1` to `/RTCu` is just making the problem not appear at the point you were seeing it before (the problem you were seeing is from the compiler adding debugging code that forces the bug to appear - it's not the bug). The root cause of the problem is almost certainly still there, you just aren't noticing it yet. – Michael Burr Jun 01 '15 at 22:44
  • 1
    The purpose of the 0xCC is to make your code break like this when you do an invalid memory access, instead of having the invalid access go unnoticed. It would be a good idea to keep hunting for the invalid access as it may come back to bite you in future (typically, when it's time to show someone else your program). – M.M Jun 01 '15 at 22:46
  • @MichaelBurr You're definitely correct, I changed the switch and it created a different issue. I can now successfully copy construct, but now my static_handle_cast (my implementation of static_pointer_cast) has an issue where declaring a Handle variable is causing the passed-in Handle to point to NULL. – Blake A. L. Richardson Jun 01 '15 at 22:55
  • 2
    You misinterpreted my comment. I merely explained, what the code in question does, and which compiler switches are responsible. @Michael has given an explanation, why you may have an invalid pointer, pointing to wiped out data: Returning the address to a local variable. You should definitely keep the runtime checks, and crank up your compiler warning level to 4. That may reveal the root cause. – IInspectable Jun 01 '15 at 22:59
  • Unfortunately I get no warnings from anything in my Handle class, nor anywhere the Handle class is used (our team has had warning level set to W4 for awhile). I'm posting some edits to the original post with some more information to see if it helps. – Blake A. L. Richardson Jun 01 '15 at 23:46
  • What is `static_handle_cast`? Since you know where the memory corruption is occurring, you should be able to work backward and find out who still has a pointer to that memory, and then work backward to how they got a pointer to freed stack memory. – Raymond Chen Jun 02 '15 at 00:38
  • @RaymondChen static_handle_cast is my implementation of static_pointer_cast, it's basically a way for me to up/down cast between Handle types whose internal pointers are able to be static_cast'd between. – Blake A. L. Richardson Jun 02 '15 at 00:55
  • "the static_cast downcast of the Object to Component is valid" That's a contradiction. Either it is an upcast, in which case you don't need `static_cast`, or it is a downcast, and invalid. – Ben Voigt Jun 02 '15 at 01:29
  • @BenVoigt Incorrect, sir. Object inherits **from** Component, so the downcast **is** valid. If I was trying to cast Component to Object via static_cast **that** would be invalid. – Blake A. L. Richardson Jun 02 '15 at 01:41
  • If `Object` inherits from `Component`, then you cannot downcast from `Object*` to `Component*`. You CAN *upcast* from `Object*` to `Component*`, and you can do so with an implicit conversion. Don't use cast syntax where it isn't necessary, it interferes with the compiler type checker. I can't tell whether you're confused about the meaning of "inherits from" or "downcast", but you keep contradicting yourself -- you can't be right about both. – Ben Voigt Jun 02 '15 at 01:59
  • @BenVoigt Yeah, I was. Sorry, terminology sometimes gets me confused. Doesn't it make more sense that downcasting is going toward the base class? I mean, it's the base. Base makes me think bottom, and toward the bottom is down. Sorry about that again. – Blake A. L. Richardson Jun 02 '15 at 02:04
  • 1
    Indeed, if you think about it that way it would be confusing. Upcast and downcast are terms from theoretical computer science, and they match the language of superclass and subclass (or subtype). Anyway, if you've ever seen a class hierarchy diagram, the base classes (superclass) are always placed at the top with their derived subclasses beneath. – Ben Voigt Jun 02 '15 at 02:09
  • So you see, everything you've read about downcasting safety and the `static_cast` vs `dynamic_cast` tradeoff is likely true... and inapplicable to this situation. You're trying to upcast, and for that you it is best to just let the compiler convert, no `dynamic_cast`, no `static_cast`, no `reinterpret_cast`. – Ben Voigt Jun 02 '15 at 02:10

1 Answers1

0

Thanks for all the great help with this issue, but we figured out what was going on. What was happening is that in some of the Handle constructors we needed to allocate memory for the pointer that the Handle's internal pointer-to-pointer was going to be pointing at. So what was happening was that the Handle itself was using a pointer that was held on the stack, and when the rep stos opcode was called, it was clearing out said pointer.

  • So you had a dangling pointer to an automatic variable that was going out of scope? – Ben Voigt Jun 02 '15 at 01:24
  • Kind of...the object that was being pointed at was new'd in the factory, so it was on the heap. However, the Handle itself uses a pointer-to-pointer, and the data holding the address that the Handle needed to point at was on the stack. When `rep stos` executed, it cleared the address from the stack. So the pointer to pointer was pointing to a location on the stack which itself was pointing to a location on the heap. – Blake A. L. Richardson Jun 02 '15 at 01:45
  • But the "location on the stack" (actually an automatic variable of pointer type) had already gone out of scope, correct? – Ben Voigt Jun 02 '15 at 01:53
  • Indeed, since it was contained within a Handle that we were returning from the CreateArchetypeComponent function. – Blake A. L. Richardson Jun 02 '15 at 02:12
  • 1
    I hope you fixed it by having the `Handle(T* obj)` constructor allocate a `new T*{obj}`, which gets deleted when the reference count hits zero. BTW, your `Handle` class doesn't appear to be storing enough metadata to support a correct implementation of `static_handle_cast`. Or to support weak pointers, you need to keep count of all handles (including weak) to decide when to delete the metadata, separate from the count of strong handles which controls the target's lifetime. – Ben Voigt Jun 02 '15 at 02:16
  • If you possibly can, you should switch to using `std::shared_ptr`. Supporting multiple ref-counted pointers of different type to (parts of) the same object, some of which may be weak, gets very complicated. – Ben Voigt Jun 02 '15 at 02:20
  • We were using it before, but we need to be able to run our game in debug mode and std::shared_ptr/std::weak_ptr and various other STL implementations are reducing our frame rate in debug to unplayable levels. Also, our Handle inherits from a class called HandleRegistrar which contains a static hashmap with all of the reference counts to all objects that currently have a Handle pointing at them. Each time we make a Handle, we check to see if it is already being tracked (the key is the address of the object) and if it is, we increment the reference count. – Blake A. L. Richardson Jun 02 '15 at 02:26
  • 1
    Ahh, the old "new project wizard sets up debug build that generates debug information and uses slow debug libraries and release build that uses fast libraries and optimizes your code so you can't debug". I suggest learning your way around the project options and using the release version of the library, but enable debug information and no optimizations in your own code. – Ben Voigt Jun 02 '15 at 02:30
  • Your `HandleRegistrar` sounds like it reinvents `enable_shared_from_this`... found an explanation here which seems pretty good: http://aldrin.co/esft.html – Ben Voigt Jun 02 '15 at 02:30