I am observing a std::string assignment operator (=) causing an access violation writing to the LHS. In MSVC++ debug mode, the LHS internal buffer points to an invalid address. I'm not familiar with the internals of MSVC++ std::string but I had previously assumed that the internal buffer pointer should never be invalid.

Using the Visual Studio debugger, the internal buffer to which I refer is the char[] instance member std::string::_Bx::_Buf. This usually holds the address of the null-terminated character string represented by the std::string object. It appears that std::string::_Bx._Ptr is also a char * pointer to this address.

I am encountering this frequently in certain circumstances but I can't determine how or when this address becomes invalid. Wouldn't the debugger alert me if something clobbered this value? Is there a way to set the Visual Studio debugger to pause when std::string::_Bx::_Buf is accessed for writing?

I cannot provide an SSCCE for this scenario because I can't reproduce the error intentionally. The code that triggers the error is just a typical string value assignment in an instance mutator, like:

class MyClass {
protected:
    std::string myValue;
public:
    void setValue(std::string value) {
        myValue = value; // ACCESS VIOLATION from std::string::operator=()
    }
};

class OtherClass {
    static void myFunc() {
        std::string myString("some value");
        MyClass *myClass = new MyClass();
        myClass->setValue(myString); // ACCESS VIOLATION from setValue()
    }
};

What could cause this? Has anyone seen this before? Any suggestions on where to look next?

taz
  • Does the code that you posted have the same issue too? Can we use it to check? – qwr Jun 19 '13 at 16:01
  • @QWR No, that code should certainly not have the same issue. I just typed it out quickly so excuse any typos, etc. It's a toy example; see https://en.wikipedia.org/wiki/Mutator_method#C.2B.2B_example – taz Jun 19 '13 at 16:12
  • Are you using std::string throughout your entire project, or do you pass char* somewhere too? I suspect that could be the cause. Here is how to get the behaviour of the explicit keyword on method parameters: http://stackoverflow.com/questions/175689/can-you-use-keyword-explicit-to-prevent-automatic-conversion-of-method-parameter – qwr Jun 19 '13 at 16:18
  • Other parts of the project use `char *` but none of the code in question uses `char *`. Because `value` is passed by value, I would think it's not the culprit. Which means that `MyClass::myValue` is getting messed with somewhere, but the only place it's modified is from `setValue()`. I'm not sure that `explicit` or overloads applies here. – taz Jun 19 '13 at 16:26

1 Answer

s._Bx._Buf is not a pointer, it's the internal small buffer std::string uses for holding small strings. This is called the small-buffer-optimization, or SBO. s._Bx is a union of the buffer and _Ptr, a pointer to the heap buffer that is allocated if the internal buffer is too small. So for small strings, s._Bx._Ptr should be invalid; after all, its storage is being used for the small string.
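To make the union idea concrete, here is a minimal sketch of a small-buffer string. It is a hypothetical toy model, not MSVC's actual implementation: the names `TinyString`, `Storage`, `kBufSize`, and `is_small` are made up for illustration, and the 16-byte inline size is an assumption.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, simplified model of a small-buffer string.
// This is NOT the real MSVC layout; names and sizes are assumptions.
class TinyString {
    static const std::size_t kBufSize = 16; // assumed inline size, terminator included

    union Storage {
        char buf[kBufSize]; // holds the characters of a short string directly
        char* ptr;          // points at a heap block for a long string
    } bx;                   // buf and ptr share the same bytes

    std::size_t size_;
    std::size_t capacity_;  // decides which union member is meaningful

public:
    explicit TinyString(const char* s)
        : size_(std::strlen(s)), capacity_(kBufSize - 1) {
        if (size_ < kBufSize) {
            // Short string: the characters live inside the object itself.
            std::memcpy(bx.buf, s, size_ + 1);
        } else {
            // Long string: allocate on the heap and remember the pointer.
            capacity_ = size_;
            bx.ptr = new char[size_ + 1];
            std::memcpy(bx.ptr, s, size_ + 1);
        }
    }

    ~TinyString() {
        if (!is_small()) delete[] bx.ptr;
    }

    // Copying is disabled to keep the sketch short.
    TinyString(const TinyString&) = delete;
    TinyString& operator=(const TinyString&) = delete;

    bool is_small() const { return capacity_ < kBufSize; }

    // Pick the union member that is currently in use.
    const char* c_str() const { return is_small() ? bx.buf : bx.ptr; }
};
```

In this model, a debugger looking at `bx.ptr` for a short string would show the first few characters reinterpreted as an address, which looks invalid but is harmless as long as the capacity says the inline buffer is in use.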

Anyway ... if you get an access violation, all is not well. In such cases, the most likely cause is that you accidentally messed with the std::string's memory, most likely due to some buffer overflow or use-after-free somewhere. It's not the assignment that's interesting, it's what happens before it.

Sebastian Redl
  • Thanks for the info about the internal buffers. How/where is "small" defined in this context? (Did you mean "So for small strings, _`s._Bx._Buf`_ should be invalid"? Edit: I understand that now.) Is there a way to make the debugger alert me if I accidentally mess with the `std::string`'s memory? – taz Jun 19 '13 at 16:14
  • 1) "Small" is defined by the experiments MS engineers did to find the best size. There's probably some constant in the code. 2) No, I didn't mean that. It's `_Ptr` that should be invalid, because `_Buf` contains string data and it sits in the same memory. 3) http://stackoverflow.com/questions/621535/what-are-data-breakpoints – Sebastian Redl Jun 19 '13 at 16:16
  • To elaborate on 3), set a normal breakpoint just after you initially create the string, find out its address with the debugger, and set a data breakpoint there. – Sebastian Redl Jun 19 '13 at 16:17
  • Ok, thanks...is it correct that at any given moment, at least one of `_Buf` and `_Ptr` should _not_ be invalid? – taz Jun 19 '13 at 16:27
  • You were correct. I had missed that right before the error occurred there was a point at which the instance of `MyClass` (going by my example in the OP) could be deleted, which was the case in this scenario. The instance was deleted via a different pointer than the one in `myFunc()`. I am surprised that I was still able to call `myClass->setValue()` afterward, despite the fact that `myClass` referred to an address that had just been deleted, and it was not until the attempted access to the instance member's internal buffer that the access violation was issued. Thanks for the info on `string`! – taz Jun 19 '13 at 17:33
  • A non-virtual function call doesn't actually use the `this` pointer in any way until it accesses data members. While it's technically undefined behavior, it sometimes works to do `((Foo*)0)->something()`. In fact, MFC used code like this back in the day, until they were forced to change it because a newer MSVC version optimized the code under the assumption that the null dereference cannot occur. – Sebastian Redl Jun 20 '13 at 12:58
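
One way to guard against the situation described in the comments, where the instance is deleted through one pointer and then used through another, is to let `std::shared_ptr` own the object and have other code hold a `std::weak_ptr`. This is only a sketch under assumptions: `MyClass` here is a stand-in for the class in the question, and `trySetValue` and `value` are hypothetical helpers added for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for the question's MyClass; value() is added for illustration.
class MyClass {
    std::string myValue;
public:
    void setValue(std::string value) { myValue = std::move(value); }
    const std::string& value() const { return myValue; }
};

// Hypothetical helper: calls setValue() only if the object still exists.
// Returns false instead of writing into memory that was already freed.
bool trySetValue(const std::weak_ptr<MyClass>& observer, const std::string& v) {
    if (std::shared_ptr<MyClass> alive = observer.lock()) {
        alive->setValue(v); // safe: 'alive' keeps the object around for this call
        return true;
    }
    return false; // the owner has already destroyed the object
}
```

A caller would create the object with `std::make_shared<MyClass>()`, hand a `std::weak_ptr<MyClass>` to code that does not own it, and check the return value of `trySetValue`. After the owner resets its `shared_ptr`, the helper returns false instead of reaching the `std::string` assignment.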