2

I'm at a loss to explain what seems to be mis-addressing within the char[] arrays of a C++ class I've inherited. I'm using Visual Studio 2005 and have narrowed the problem down to this:

MyClass.h:

class MyClass {
public:
  MyClass();
  virtual ~MyClass();
  void Reset();
  // More member functions. . .

  char m_szString[128];
  // More member variables. . .
}

MyClass.cpp:

MyClass::MyClass() { Reset(); }
MyClass::Reset()   { m_szString[0] = 'X'; }
// . . .

As I single-step through the program, I find that the Reset() function actually sets m_szString[4] to 'X' — not m_szString[0] as I expected. According to the watch window, the only element in the class before m_szString[] is a pointer to the vftable, __vfptr, which happens to be 4 bytes.

If I add more member variables to MyClass, subsequent strings are mis-addressed by various, always-increasing multiples of 4 bytes. Not simply aligned to 4-byte boundaries, but actually offset by multiples of 4. It's as if the compiler is skipping twice the necessary space for each vftable ... but that's purely a guess.

Somewhat similar problems have been reported (Google, MSDN), but I haven't found any answers.


Additional Information / Partial Solution: The class is the only member variable of a wrapper class that becomes a DLL. The parent was originally declared as

class ATL_NO_VTABLE CWrapperClass

Removing ATL_NO_VTABLE fixed the alignment problem.

I'm still seeing buffer overflows, but those should be fairly easy to track down.

Can you explain ATL_NO_VTABLE in terms understandable to an embedded C developer who's had very limited experience with COM beyond BSTRs, or better yet, provide a pointer (sorry) to a good reference?


Still More: This question provides some helpful debugging information.

Thanks for your help.

Community
  • 1
  • 1
Adam Liss
  • 47,594
  • 12
  • 108
  • 150

4 Answers4

1

Two questions here:

  • What's the problem? I don't mean what seems fishy in the debugger, but what actually goes wrong at the top level? What you're seeing in the debugger could be the result of optimizations, rather than being an actual error.
  • Do you have some minimal code that reproduces the error? The above doesn't really tell me much. If I'd run into exactly the same problem previously, it might be useful, but I haven't. So if I wanted to reproduce the problem, what should I do?
jalf
  • 243,077
  • 51
  • 345
  • 550
  • Thanks for your reply! 1st symptom was the debugger reporting a buffer overrun as it returned from a function that uses this class. Further inspection showed the char[] arrays weren't being set as expected, which ultimately led to the 'off by 4' observation. All optimizations are disabled. – Adam Liss Mar 11 '09 at 17:30
1

My first guess would be compiling with unintended optimizations turned on.

My second guess would be that something strange is going on with unicode support (e.g. char vs wchar_t).

Dashogun
  • 454
  • 5
  • 11
  • Thanks for your thoughts. The problem was definitely ATL_NO_VTABLE, probably related to COM. Writing a string of chars to the array does in fact write the correct chars to consecutive elements; it's only the offset that was wrong. I've quadruple-checked all optimizations -- they're off. – Adam Liss Mar 11 '09 at 20:42
0

The Answer

The compiler was, in fact, calculating an incorrect address for some of the class members. The culprit turned out to be a #pragma pack 1 directive hidden in an obscure header file that was #included in some of the source files.

How I found it

50% persistence, 50% luck, and 10% math skills. :-)

I came across a very simple class with few methods and members, most of which were defined in the header file, including:

typedef int BOOL;  // Legacy, makes my skin crawl; don't ask.
BOOL isConnected() { return m_bConnected; };
BOOL m_bConnected;

...and the function returned false when I knew  m_bConnected was true:

  • The watch window showed the correct value. Programmatic changes to m_bConnected were reflected in the window.
  • A watch on &m_bConnected showed that it began 8 bytes into the class.
  • The raw memory window showed the correct value. Programmatic changes to m_bConnected were reflected there as well.
  • The 4 bytes before m_bConnected were all 0, which I interpreted to be padding.
  • Stepping the debugger through the code clearly showed a return value of false!

So I checked the disassembly window and found this (comments are mine):

mov    eax,dword ptr [this]        ; address of the class instance
mov    eax,dword ptr [eax+4]       ; offset to m_bConnected

In other words, the compiler calculated the offset as 4, when it should have been 8.

Aha!

Interestingly, when I removed the definition of isConnected() from the header and placed it into my source file, the address was calculated correctly! That convinced me that it was, indeed, a packing problem, and a search of the code base for #pragma pack quickly identified the culprit, an ancient header file that (legitimately) required alignment on 1-byte boundaries but never reset the packing to the default.

The Fix

The solution was as easy as enclosing the offending header like this:

#pragma pack(push)
// original code here
#pragma pack(pop)

Thanks for your interest. And Sara, if you're reading, I'm about to dance on my desk!

Community
  • 1
  • 1
Adam Liss
  • 47,594
  • 12
  • 108
  • 150
-2

Its always better to null terminate the string before you start using . By doing so even while debugging you will have a cleaner view of the array. More ever you have declared m_szString in public . Its better you declare the array in the private area and return it using a member function

sameer karjatkar
  • 2,017
  • 4
  • 23
  • 43
  • True, but all irrelevant with respect to the question. Regardless of how any of the elements is declared or initialized, the indexing should always be correct. This is not a question about best practices; it's a question about the way the compiler computes addresses. – Adam Liss Mar 12 '09 at 14:07