16

Possible Duplicate:
std::string and its automatic memory resizing

I am just curious: how are strings stored in memory? For example, when I do this:

string testString = "asd";

It allocates 4 bytes, right? a + s + d + \0.

But later, when I want to assign some new text to this string, it works, but I don't understand how. For example, I do this:

testString = "123456789";

Now it should be 10 bytes long. But what if there isn't space for such a string? Let's say the fifth and sixth bytes from the beginning of the string are already taken by two other chars. How does the CPU handle that? Does it find a completely new position in memory where the string fits?

user1145902
  • Maybe not an exact duplicate, the original question already knew that the memory was dynamically allocated by the string, which I am not sure was the case here. Anyway I won't vote to reopen. – David Rodríguez - dribeas Feb 03 '12 at 18:06

4 Answers

22

This is implementation-dependent, but the general idea is that the string class contains a pointer to a region of memory where the actual contents of the string are stored. Two common implementations store either three pointers (beginning of the allocated region and data, end of data, end of allocated region) or one pointer (beginning of the allocated region and data) plus two integers (number of characters in the string and number of allocated bytes).
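As a rough sketch of those two layouts (the struct and member names here are hypothetical and are not taken from any real standard library), both carry the same information and differ only in how size and capacity are derived:

```cpp
#include <cstddef>

// Layout A (hypothetical): three pointers into one allocation.
struct ThreePointerLayout {
    char* begin;        // start of the allocated region and of the data
    char* end;          // one past the last character stored
    char* storage_end;  // one past the end of the allocated region

    std::size_t size() const { return static_cast<std::size_t>(end - begin); }
    std::size_t capacity() const { return static_cast<std::size_t>(storage_end - begin); }
};

// Layout B (hypothetical): one pointer plus two counters.
struct PointerAndCountsLayout {
    char* begin;            // start of the allocated region and of the data
    std::size_t length;     // number of characters in the string
    std::size_t allocated;  // number of allocated bytes

    std::size_t size() const { return length; }
    std::size_t capacity() const { return allocated; }
};
```

Either struct occupies three machine words on typical platforms, so the choice mostly affects which operations need a subtraction.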

When new data is appended to the string, if it fits in the allocated region it is simply written and the size/end-of-data pointer is updated accordingly. If the data does not fit, a new, larger buffer is allocated, the existing contents are copied into it, and the old buffer is released.
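One way to watch this happen without a debugger is to record `capacity()` while appending. The helper below is a minimal sketch (`capacity_steps` is a name made up for this example); the exact numbers it reports depend on the standard library in use:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends `count` characters one at a time and records every distinct
// capacity value the string reports along the way. A change in capacity
// means the string switched to a larger buffer and copied its contents.
std::vector<std::size_t> capacity_steps(std::size_t count) {
    std::string s;
    std::vector<std::size_t> steps{s.capacity()};
    for (std::size_t i = 0; i < count; ++i) {
        s.push_back('x');
        assert(s.size() <= s.capacity());  // the contents always fit the buffer
        if (s.capacity() != steps.back()) {
            steps.push_back(s.capacity());
        }
    }
    return steps;
}
```

Calling `capacity_steps(1000)` and printing the result typically shows only a handful of steps, because implementations usually grow the buffer by a multiplicative factor rather than one byte at a time.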

Also note that many implementations have an optimization for small strings, where the string object itself contains a small buffer. If the contents of the string fit in that buffer, no memory is dynamically allocated and only the local buffer is used.
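Whether a given string currently lives in such a local buffer can be probed by checking if `data()` points inside the string object itself. This is only a diagnostic sketch (`stored_inside_object` is a hypothetical helper), and the answer for short strings varies between implementations:

```cpp
#include <functional>
#include <string>

// Returns true if the character data lies within the bytes of the
// std::string object itself, which suggests a small-string buffer is in use.
bool stored_inside_object(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    std::less<const char*> before;  // total order even for unrelated pointers
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

On implementations with the optimization, `stored_inside_object(std::string("asd"))` is usually true, while a string of 1000 characters cannot fit inside the object and must use a separate buffer.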

David Rodríguez - dribeas
  • Well 3 pointers is 24 bytes (on 64 bit arch). That's a reasonable sized string if you just write over the pointers using their space as the buffer. – Martin York Feb 03 '12 at 17:23
  • @LokiAstari: You would have to add an extra flag somewhere to identify whether the memory is used as a buffer or not. But it is a fair assumption that it is *cheap* to reuse the pointers for a small buffer optimization. – David Rodríguez - dribeas Feb 03 '12 at 18:05
4

string is not a simple datatype like char *. It's a class, which has implementation details that aren't necessarily visible.

Among other things, string includes a counter to keep track of how big it really is.

char test[] = "asd";       // allocates exactly 4 bytes
string testString = "asd"; // who knows?

testString = "longer";     // allocates more if necessary

Suggestion: write a simple program and step through it using a debugger. Examine the string, and see how the private members change as the value is changed.

egrunin
2

string is an object, not just some memory location. It dynamically allocates memory as needed.

The = operator is overloaded; when you say testString = "123456789"; a member function is called that handles the const char * you passed in.
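To make that concrete, here is a deliberately tiny class with an overloaded assignment from `const char *`. `MiniString` is a hypothetical teaching example, not how any standard library implements `std::string`; it only shows the "reallocate if it does not fit" idea:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class MiniString {
public:
    MiniString() = default;
    MiniString(const MiniString&) = delete;             // copying omitted for brevity
    MiniString& operator=(const MiniString&) = delete;
    ~MiniString() { delete[] data_; }

    // Invoked for `s = "text";`
    MiniString& operator=(const char* text) {
        assert(text != nullptr);
        const std::size_t len = std::strlen(text);
        if (len + 1 > capacity_) {
            // Does not fit: get a bigger buffer, copy, release the old one.
            char* bigger = new char[len + 1];
            std::memcpy(bigger, text, len + 1);
            delete[] data_;
            data_ = bigger;
            capacity_ = len + 1;
        } else {
            // Fits: reuse the existing buffer (memmove tolerates overlap).
            std::memmove(data_, text, len + 1);
        }
        size_ = len;
        return *this;
    }

    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    char* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;  // bytes allocated, including the terminator
};
```

Assigning "asd" and then "123456789" to a `MiniString` triggers one allocation per assignment, while a later shorter assignment reuses the larger buffer.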

Brian Roach
  • If I'm not mistaken, `std::string` is not an object. – jrok Feb 03 '12 at 17:12
  • So the chars in a string don't have to be laid out in memory byte by byte (like a normal array), but can be stored in some random location? – user1145902 Feb 03 '12 at 17:13
  • @jrok: It seems that you are mistaken... unless you actually mean that `std::string` is a type (instantiation of a template) from which objects are instantiated... – David Rodríguez - dribeas Feb 03 '12 at 17:16
  • @user1145902: They are stored in memory like in an array, but that memory is not allocated in the stack (or wherever the string object is), but rather in a dynamically allocated buffer. – David Rodríguez - dribeas Feb 03 '12 at 17:17
  • @DavidRodríguez-dribeas Yes, I do mean that. – jrok Feb 03 '12 at 17:17
  • @user1145902 - that would be implementation specific. There are a number of approaches you could use - a linked list containing "chunks" of the entire string, for example, where additional nodes are created as needed. This would mean that indeed the entire string was not contiguous. – Brian Roach Feb 03 '12 at 17:19
  • @user1145902 As of last summer, the new standard C++11 requires that `std::string` stores its contents contiguously. §21.4.1/5 – jrok Feb 03 '12 at 17:24
  • @BrianRoach: Actually the (draft) standard was tightened on that many years ago. The string is required to store the char values in a contiguous block of memory. See [string.require]. This was because people were doing this `&data[0]` to get a C-string pointer as it made it easy to write template code that looked like an array (which included string). – Martin York Feb 03 '12 at 17:31
  • OK, so I don't have to worry about string size - it's handled automatically. Thank you. – user1145902 Feb 03 '12 at 17:31
  • @LokiAstari - I wasn't sure what the current standard defines, but really I was just trying to explain that an implementation wouldn't *have* to be contiguous even though AFAIK most have been. – Brian Roach Feb 03 '12 at 17:34
  • @BrianRoach: They did a survey before changing the rule in the standard and found that all implementations used contiguous memory. So that was easy to push past the committee. The requirement was introduced in N2134, released Nov 2006. See http://stackoverflow.com/a/4653479/14065. It is definitely now part of the current standard N3337. – Martin York Feb 03 '12 at 17:47
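The contiguity guarantee discussed in these comments can be checked directly. A minimal sketch, assuming a C++11 or later compiler (`is_contiguous` is a name invented for this example):

```cpp
#include <cstddef>
#include <string>

// Verifies that character i sits exactly i bytes after the start of the
// buffer, i.e. the contents form one contiguous block like a plain array.
bool is_contiguous(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (&s[i] != s.data() + i) {
            return false;
        }
    }
    return true;
}
```

Under the C++11 rule this returns true for every string, which is what makes `&s[0]` usable as a pointer to the whole character sequence.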
2

It's stored with a size. If you store a new string that doesn't fit, it will deallocate the existing memory and allocate new memory to cope with the change in size; if the new string fits, the existing memory can be reused.

And it doesn't necessarily allocate 4 bytes the first time you assign a string of 4 bytes to it. It may allocate more space than that (it won't allocate less).
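A small helper makes it easy to compare a string before and after assignment. This is a sketch only (`Snapshot` and `snapshot` are hypothetical names), and the capacities you see depend on the implementation:

```cpp
#include <cstddef>
#include <string>

// Captures the observable storage state of a string at one moment.
struct Snapshot {
    std::size_t size;      // characters currently stored
    std::size_t capacity;  // characters the current buffer can hold
    const void* buffer;    // address of the character data
};

Snapshot snapshot(const std::string& s) {
    return {s.size(), s.capacity(), s.data()};
}
```

Taking a snapshot of `testString` after `= "asd"` and again after `= "123456789"` shows the size going from 3 to 9, with the capacity never smaller than the size and often larger.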

Tom Tanner