
I have the following code that randomly crashes my application:

for(map<_type, boost::shared_ptr<CRowHeaderEx<_type> > >::iterator itr = m_RowMap.begin(); itr != m_RowMap.end(); ++itr)
{
    boost::shared_ptr<CRowHeaderEx<_type> >  pRow = itr->second;
    time_t previoustime = pRow->get_DataReceived();
    if(currenttime - previoustime > Threshold)
    {
        listofdeletedkey.push_back(itr->first);
    }
}

The crash happens at the end of the for loop, in the shared_ptr destructor. And this crash is random and not easily reproducible.

Exception : Unhandled exception at 0x00000752 in memory.hdmp: 0xC0000005: Access violation reading location 0x00000752.

Stack trace:

xxx.exe!boost::detail::sp_counted_base::release() Line 103  C++
xxx.exe!boost::detail::shared_count::~shared_count() Line 309   C++
xxx.exe!boost::shared_ptr<CRowHeaderEx<int> >::~shared_ptr<CRowHeaderEx<int> >()    C++
xxx.exe!CRowManagerEx<int>::PurgeRecords(int Threshold) Line 385    C++

The crash occurs when dispose() is called in boost::detail::sp_counted_base::release():

void release() // nothrow
{
    if( BOOST_INTERLOCKED_DECREMENT( &use_count_ ) == 0 )
    {
        dispose();
        weak_release();
    }
}

disassembly:

        {
            dispose();
00412B57  mov         edx,dword ptr [this]  
00412B5A  mov         eax,dword ptr [edx]  
00412B5C  mov         ecx,dword ptr [this]  
00412B5F  mov         edx,dword ptr [eax+4]  
00412B62  call        edx  

The value of edx here is 0x00000752, and that is what causes the access violation.

AMIC MING
Krishna

1 Answer


And this crash is random and not easily reproducible.

Your program is experiencing some form of memory corruption. My earlier answer on identifying memory corruption with WinDbg/PageHeap on Windows may be useful here:

https://stackoverflow.com/a/22074401/2724703

The value of edx here is 0x00000752, and that is what causes the access violation.

This indicates that your program is accessing an address very close to NULL (offset 0x752, i.e. 1874 bytes). In the disassembly, edx is loaded from the control block's vtable ([eax+4]), so a value like this suggests that the control block, or its vtable pointer, was overwritten or already freed. There could be several reasons for this, and the information given so far is not enough to tell which one applies. One possibility is that your program is multi-threaded and another thread is releasing the same shared object concurrently with this thread.
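To make the link between the disassembly and the control block concrete, here is a minimal sketch of a reference-counted control block with a virtual dispose(). CountedBase, CountedInt and last_release_disposes are hypothetical names for illustration only; this is a simplified model (no atomics, no weak count), not Boost's actual implementation.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for boost::detail::sp_counted_base.
// Not Boost's real code: no atomics and no weak count.
struct CountedBase {
    long use_count = 1;
    virtual ~CountedBase() {}
    virtual void dispose() = 0;  // resolved at run time through the vtable

    void add_ref() { ++use_count; }

    // Returns true if this call dropped the last reference.
    bool release() {
        if (--use_count == 0) {
            dispose();  // indirect call: load vptr from *this, load slot, call
            return true;
        }
        return false;
    }
};

// Control block owning a single int (hypothetical payload).
struct CountedInt : CountedBase {
    int* p;
    bool* disposed_flag;
    CountedInt(int* ptr, bool* flag) : p(ptr), disposed_flag(flag) {}
    void dispose() override {
        delete p;
        p = nullptr;
        *disposed_flag = true;
    }
};

// Two owners release in turn; only the second release disposes.
bool last_release_disposes() {
    bool disposed = false;
    CountedInt* block = new CountedInt(new int(7), &disposed);
    block->add_ref();                        // second owner, count == 2
    bool first = block->release();           // count == 1, no dispose
    bool disposed_after_first = disposed;
    bool second = block->release();          // count == 0, dispose() runs
    delete block;
    return !first && !disposed_after_first && second && disposed;
}
```

Because dispose() is reached through a pointer stored in the block itself, any stray write over the block (or a use after it has been freed) can turn that call into a jump to an arbitrary value such as 0x752, even though the code performing the release is correct.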

EDIT

The following is from the Boost documentation:

shared_ptr objects offer the same level of thread safety as built-in types. A shared_ptr instance can be "read" (accessed using only const operations) simultaneously by multiple threads. Different shared_ptr instances can be "written to" (accessed using mutable operations such as operator= or reset) simultaneously by multiple threads (even when these instances are copies, and share the same reference count underneath.)

Any other simultaneous accesses result in undefined behavior.
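As a rough illustration of the rule quoted above, the sketch below contrasts the two cases. It uses std::shared_ptr and std::thread instead of Boost so that it is self-contained; that substitution is an assumption on my part (std::shared_ptr gives the same guarantees), and Row, copies_per_thread and guarded_writes are hypothetical names, not taken from the question's code.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical row type standing in for CRowHeaderEx<_type>.
struct Row { int value = 42; };

// Allowed: every thread owns a distinct shared_ptr copy and only
// mutates its own copies, although all share one reference count.
int copies_per_thread(int n_threads) {
    std::shared_ptr<Row> row = std::make_shared<Row>();
    std::atomic<int> reads(0);
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        // 'row' is captured by value: a separate instance per thread.
        workers.emplace_back([row, &reads]() {
            for (int k = 0; k < 1000; ++k) {
                std::shared_ptr<Row> local = row;  // copy, then destroy
                if (local->value == 42) ++reads;
            }
        });
    }
    for (std::thread& t : workers) t.join();
    return reads.load();
}

// One shared_ptr instance written by several threads: every access
// must hold the lock, otherwise the behavior is undefined.
int guarded_writes(int n_threads) {
    std::shared_ptr<Row> slot;
    std::mutex m;
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([&slot, &m, i]() {
            std::lock_guard<std::mutex> lock(m);
            slot = std::make_shared<Row>();  // write to the shared instance
            slot->value = i;
        });
    }
    for (std::thread& t : workers) t.join();
    return slot->value;  // index of whichever thread wrote last
}
```

In the second function, removing the lock_guard would put the code in the "any other simultaneous accesses" category from the quote, which is one way a control block can end up released twice.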

Community
Mantosh Kumar
  • Thanks for the response. The application is multi-threaded, but all access to m_RowMap is protected by a critical section. The code that throws the exception is in the smart pointer destructor. The underlying object is accessed only through shared_ptr and never directly anywhere in the code. Shouldn't shared_ptr maintain the reference count and delete the object only when no code references it? – Krishna Apr 14 '14 at 23:19
  • @Krishna: I have updated my post regarding shared_ptr thread safety. You mentioned that all access to the m_RowMap variable is protected by a critical section, but an exception thrown from the destructor indicates that something serious has happened in your program. I still think some form of memory corruption is occurring due to the multi-threaded nature of your program. A dynamic analysis tool would be useful in these cases, as I mentioned. I have nothing further to add to my post. – Mantosh Kumar Apr 15 '14 at 00:36
  • I agree that it is some sort of heap corruption. It has nothing to do with Boost. I looked at the m_RowMap entries, and objects stored in a particular range of memory have corrupted values. Now the hard part is finding out which part of the code is corrupting the heap. Thanks for your help. – Krishna Apr 15 '14 at 19:11