Basically, string is a typedef for the basic_string class template instantiated with char: typedef basic_string<char> string;

As far as I know, basic_string is a class that contains some data members and member functions, just like any class a developer might write.

Calling memset on a user-defined class does not cause any problems (except for classes with virtual functions). So why does it cause problems when applied to a library class?

From links such as this one (memset structure with std::string contained), I learned that it is not safe to apply memset to a string because it modifies internal data.

So my question is what kind of internal data is modified? As far as I can tell, it can only be PODs or user-defined data types. Is there some other low-level implementation detail inside the string library?

Please note, I am not speaking about using the object after memsetting the class object. My only concern is the memset call itself.

I know that using memset on a class is a really bad idea. This is just to learn about the internal implementation.

VINOTH ENERGETIC
  • How did you do it? Show the line of code. BTW Why did you need memset? – Mohit Jain Jun 22 '15 at 12:29
  • There's no way of knowing what "internal data is modified", because the data is *internal*. The C++ specification doesn't say what data members a `std::basic_string` class needs, only the public functions and the behavior of those functions. Mixing old C functions with C++ objects is always going to be problematic, so just don't do it. Instead follow the [rule of three, five or zero](http://en.cppreference.com/w/cpp/language/rule_of_three), then you can just use plain assignment to "clear" a structure. – Some programmer dude Jun 22 '15 at 12:30
  • @Mohit Jain I am not going to use it anywhere. It is just to gain knowledge – VINOTH ENERGETIC Jun 22 '15 at 12:34
  • Some implementations have optimisations for short strings, for example, where the characters are stored in the memory pointer itself rather than in a heap allocation. Then of course the internal representation may not be what you expect. But anyway it's *not idiomatic* c++ so it isn't likely to work or be portable or stable. – Robinson Jun 22 '15 at 12:37
  • Don't gain internal implementation knowledge. The purpose of standards is to not require knowing how something is done to use it correctly. – UmNyobe Jun 22 '15 at 12:41
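The "plain assignment" idea from the comments can be sketched roughly as follows. This is only an illustration: `Record` and `reset` are hypothetical names, and the struct relies on std::string managing its own memory (rule of zero), so no memset is needed to clear it.

```cpp
#include <cassert>
#include <string>

// Hypothetical struct used only for illustration (not from the question).
struct Record {
    int id = 0;
    std::string name;  // manages its own memory, so Record needs no special members
};

// "Clear" the object by assigning a freshly value-initialized Record.
// The old string buffer is released by std::string's own assignment operator.
inline void reset(Record& r) {
    r = Record{};
}
```

After `reset(r)`, `r.id` is 0 and `r.name` is empty, and the destructor of `r` still sees a valid string.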

1 Answer

So my question is what kind of internal data is modified?

There's no way to know for sure, because the class is opaque, but usually it is pointers that present the biggest concern.

I am not speaking about using the object after memsetting the class object

You don't have to use it in order for something bad to happen. Once you overwrite the internal data with something else, there is a very good chance of breaking the invariants expected by the destructor.

Consider an implementation of std::string that uses a pair of pointers to its string buffer, and see what happens here:

std::string a("hello, world!");
memset(&a, 0xdd, sizeof(std::string)); // fills every byte with 0xdd; we have a memory leak here
// string gets destroyed here --> undefined behavior

The first line allocates space for "hello, world!" and stores pointers to it inside the std::string. memset then writes junk over these pointers, which creates the first problem: the memory allocated for the string is lost, so it leaks. The second problem is worse: when the destructor runs to free the memory, it passes the invalid pointers to delete[], causing undefined behavior.
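One way to see why a plain user-defined struct tolerates memset while std::string does not is to ask the compiler whether the type is trivially copyable. The following is a minimal sketch, not a definitive rule set: `PlainPoint` and `zero_bytes` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical plain struct: only built-in members, no destructor of its own.
struct PlainPoint {
    int x;
    int y;
};

// Byte-wise functions such as memset only make sense for trivially copyable types.
static_assert(std::is_trivially_copyable<PlainPoint>::value,
              "PlainPoint can be filled byte by byte");
// std::string has a non-trivial destructor, so it is not trivially copyable.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be filled byte by byte");

// Hypothetical guarded helper: refuses to compile for types such as std::string.
template <typename T>
void zero_bytes(T& obj) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_bytes requires a trivially copyable type");
    std::memset(&obj, 0, sizeof(T));
}
```

Calling `zero_bytes` on a `PlainPoint` sets both members to 0, while calling it on a std::string is rejected at compile time instead of corrupting the object at run time.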

Sergey Kalinichenko
  • The core dump occurs at the memset call itself; the points you have mentioned relate to work done after the memset – VINOTH ENERGETIC Jun 22 '15 at 12:43
  • @VinothKumar does it matter? Anyway, the destructor (i.e. a job done after memsetting) will certainly be called after that statement, which means there IS undefined behavior – UmNyobe Jun 22 '15 at 12:45
  • @VinothKumar The main point is that you cannot do it on any "live" object. Having the destructor run is part of the package that you get when you create an object: you do the setup with understanding that there's going to be cleanup later on. UB that happens during cleanup is still a UB. – Sergey Kalinichenko Jun 22 '15 at 12:50
  • He can still use memset on just the "string buffer" of the string, `&a[0]`. Add that case to your answer for my upvote :P – CoffeDeveloper Jun 22 '15 at 13:16
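The case from the last comment, applying memset only to the characters the string owns, might look roughly like this. It is a sketch under the assumption of C++11 or later, where the characters of a std::string are stored contiguously; `fill_chars` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Overwrite the characters of s in place. The std::string object itself
// (its internal pointers and size) is never touched, only its buffer.
inline void fill_chars(std::string& s, char c) {
    if (s.empty()) {
        return;  // nothing to overwrite
    }
    std::memset(&s[0], c, s.size());  // writes exactly s.size() characters
}
```

The size of the string stays the same and its destructor still sees valid internal data. A more idiomatic way to get the same result is `s.assign(s.size(), c)`.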