9

I cannot replicate this problem in a smaller program that I can show so I will illustrate the question with screenshots.

I have a class that declares a union and a static member of the union:

class 
{ 
/* rest of the class */

    union EmptyString
    {
      char8   m_Empty8[1];
      uchar8  m_EmptyU8[1];
      char32  m_Empty32[1];
    };

    static const EmptyString sm_emptyString;
};

  // Definition
  template <typename T>
  const typename StringBase<T>::EmptyString 
    StringBase<T>::sm_emptyString = { 0 };

I then have a function that returns the address of the string. For debugging purposes, I have added the variable address so that I can add it to the watch window. Here is the function (T here is a char8 where char8 is a typedef for char):

  template <typename T>
  const char8* StringBase<T>::GetEmptyString( char8 )
  {
    const char8* address = sm_emptyString.m_Empty8;
    //const char8* address = &(sm_emptyString.m_Empty8[0]);
    return address;
    //return sm_emptyString.m_Empty8;
  }

As you can see, initially I had the one return statement, which I commented out so that I can debug the snippet. The other line that is commented out is taking the address of the first (and only) element of the array; this commented out line does the same thing as the line previous to it, which is: const char8* address = sm_emptyString.m_Empty8;.

The program crashes because I am not getting the correct address by the above function and this has stumped me. Below I will start screenshots of my debugging session:

Break before getting address

Above, you can see that I am breaking just before the address is copied to the pointer. Below is the watch window values for sm_emptyString and address at this break.

enter image description here

So far, nothing special. Below is the screenshot after stepping over one line of code.

enter image description here

And the corresponding watch window:

enter image description here

As you can see, the value copied to address is wrong. Another thing to note is that this function gets called a few times and the copy of the static variable address to address is correct. When the address is incorrect and is returned by the function, my program crashes (because of the assumptions about the address later on in the program).

What is happening here? Am I over-writing memory somewhere? How do I go about debugging this?

EDIT:

The disassembly (according to the debugger, the address of sm_emptyString is 129F184, different than what the disassembly shows which is 128F134):

template <typename T>
  const char8* StringBase<T>::GetEmptyString( char8 )
  {
0119B850  push        ebp  
0119B851  mov         ebp,esp 
0119B853  push        ecx  
0119B854  mov         dword ptr [ebp-4],0CCCCCCCCh 
    const char8* address = sm_emptyString.m_Empty8;
0119B85B  mov         dword ptr [address],offset StringBase<char>::sm_emptyString (128F134h) 
    //const char8* address = &(sm_emptyString.m_Empty8[0]);
    return address;
0119B862  mov         eax,dword ptr [address] 
    //return sm_emptyString.m_Empty8;
  }

What I am using: Windows 7, VC++ 2008, Debug build

Samaursa
  • 16,527
  • 21
  • 89
  • 160
  • 1
    What happens if you change the union members from 1 element arrays to just a simple variables? ie `union { char8 m_Empty8; ... }` and then assign address as `const char8* address = &(sm_emptyString.m_Empty8);`. Do you still get the same problem? – Andrew Tomazos Sep 23 '12 at 23:19
  • Also what happens if you change the union into a struct? (just replace union keyword with struct). Do you still get the same problem? – Andrew Tomazos Sep 23 '12 at 23:21
  • Added to that, try changing `char8 m_Empty8[1]` and friends to straight pointers. So `char8 *m_Empty8` instead. – Aatch Sep 23 '12 at 23:22
  • 1
    Did you actually **define** your `static` member or did you only declare it? Given that you neither show the definition nor have given the class containing the `union` a name this might be the problem. – Dietmar Kühl Sep 23 '12 at 23:22
  • @DietmarKühl if he didn't define it it wouldn't compile would it? – Seth Carnegie Sep 23 '12 at 23:26
  • It would compile but wouldn't link. – Andrew Tomazos Sep 23 '12 at 23:28
  • Yeah, I mean "it would not execute" – Seth Carnegie Sep 23 '12 at 23:31
  • @SethCarnegie: The compiler is allowed to complain if it doesn't find a definition. I don't think it is required to fail. I'm not saying that this is the issue but given that the class with this `static` member doesn't have a name, I can't see who it possibly **is** defined and I can imagine that this receives special treatment. – Dietmar Kühl Sep 23 '12 at 23:31
  • @DietmarKühl § 9.4.2.4 says _There shall be exactly one definition of a static data member that is odr-used (3.2) in a program; no diagnostic is required._ So you're right. – Seth Carnegie Sep 23 '12 at 23:38
  • @DietmarKuhl: I have updated the question with the definition of the static variable. – Samaursa Sep 23 '12 at 23:43
  • @AndrewTomazos-Fathomling: Changing the union to struct does not change the outcome. It still manages to grab the incorrect address. Again, just like before, the same function is called many more times and returns the correct address. I did not yet try to remove the other elements in the union as that requires a lot of code to be commented out. I will do so asap. – Samaursa Sep 23 '12 at 23:57
  • Does the incorrect address of `sm_emptyString`show up for all types of `T`? On a larger note, how/when does this static get initialized for each type of `T`? You would have to debug the CRT during the time it inits all statics. – Chris O Sep 24 '12 at 00:04
  • @Samaursa try looking at the assembly and see what it's doing. – Seth Carnegie Sep 24 '12 at 00:12
  • @Samaursa: It seems, you class is `template class StringBase`. The definition you gave doesn't bind any of the members of the `union`. Also, this is a template which needs to be instantiated: it will be instantiated in all translation units where it gets accessed and is visible. Put the definition into a `.cpp` file and explicitly instantiate it! Why is this a `union` anyway rather than a suitably specialized type with the static member defined in a suitable translation unit? – Dietmar Kühl Sep 24 '12 at 00:17
  • @SethCarnegie: Added assembly output which I looked at. Somehow the address it is grabbing is incorrect. – Samaursa Sep 24 '12 at 00:33
  • @DietmarKuhl: I don't understand what you mean by **definition doesn't bind any of the members of the union**. I agree that I can put the definition in the `.cpp` file but I am wondering why that should matter considering that defining static members of template class like I did is standard (or at least according to the answer here to the same question I asked some time ago: http://stackoverflow.com/q/6397330/368599)? Also, does `union` have issues in this kind of usage? – Samaursa Sep 24 '12 at 00:37

1 Answers1

2

Use a function? The empty string you need has nothing to do with StringBase<T> nor its state nor its template argument.

inline const char8* GetEmptyString( char8 )     
{
    static const char8 address[1] = {0};
    return address;
}

EDIT: Why i suggest it is that using union like your code uses is wrong. Some things that C99 standard say:

  • 6.2.5p20, union has an overlapping set of member objects
  • 6.7.2.1p14, the value of at most one of the members can be stored in a union object at any time
  • Annex J.1 the value of a union member other than the last one stored into is unspecified.

Same goes with C++, just don't have the standard handy now. So you can not use union to store 3 strings at same place at same time. While I am not sure what actually causes the problems, it is likely totally elsewhere, you should get rid of such questionable constructs.

Öö Tiib
  • 10,809
  • 25
  • 44
  • I do like the suggestion but it does not solve the original issue. If I get the proper address with this function, then I am just sweeping the bug (perhaps elsewhere and manifesting in the code given) under the rug (so to speak). – Samaursa Sep 24 '12 at 01:31
  • The original issue is that your code contains inevitable undefined behavior. You may not read the value of member of union that was not the last one what was written. Not in C not in C++. So i tried to suggest sane solution. – Öö Tiib Sep 24 '12 at 01:44
  • Ah, I did not know this was undefined behavior (+1). Actually, I do not know what 'this' means: _You may not read the value of member of union that was not the last one what was written_. I would appreciate if you could elaborate on that. – Samaursa Sep 24 '12 at 01:54
  • Ok i put some words of standard into my answer. – Öö Tiib Sep 24 '12 at 02:29