0

I want to understand how variables, arrays and strings are actually stored in memory in C. Please correct my understanding of memory in the example below:

In C, when we declare a string, we proceed as follows:

char string[]="string";

What happens in memory? Is each character stored in its own memory cell, and does each cell have its own address?

For example, the address

1600 char[0]='S'
1602 char[2]='t'

and so on. Is this true? If not, please give me a correct picture of what really happens.

My second question is about C++: C++ introduced a new data type, string. For example:

string variable("This is a string");

How is this text ("This is a string") stored in memory?

Johannes S.
L.Zak
  • The characters are stored as bytes, one after the other, in a contiguous memory block with a zero byte appended: char[0]='s' char[1]='t' ... char[6]=0. The C++ string type wraps the C representation in all compiler implementations I know, meaning you can get a standard C array of characters from a string object. – scraatz Jan 25 '14 at 19:30
  • AFAIR, in C++ the expression `"string"` is still a 7-element character array. The `std::string` class, OTOH, packs much more functionality. – Javier Jan 25 '14 at 19:35

4 Answers

3

Check out Strings in Depth:

[...] The exact implementation of memory layout for the string class is not defined by the C++ Standard. This architecture is intended to be flexible enough to allow differing implementations by compiler vendors, yet guarantee predictable behavior for users. [...]

[...] In C++, individual string objects may or may not occupy unique physical regions of memory, but if reference counting is used to avoid storing duplicate copies of data, the individual objects must look and act as though they do exclusively own unique regions of storage. [...]

You can find out how the string is implemented by your compiler as follows. For me (test under VS2010), it will be

string variable("This is a string");
printf("%p\n", (void*)&variable[0]);        // 006751C0
printf("%p\n", (void*)&variable[1]);        // 006751C1
printf("%p\n", (void*)&variable[2]);        // 006751C2
printf("%p\n", (void*)&variable[3]);        // 006751C3
printf("%p\n", (void*)&variable[4]);        // 006751C4
printf("%p\n", (void*)&variable[5]);        // 006751C5
... ...
herohuyongtao
2

In both C and C++, strings are stored as null-terminated arrays of char. Each char occupies exactly one byte (8 bits wide on virtually all current platforms, which is enough for an ASCII code), and one extra byte is needed to store the '\0' character, so "hi" needs an array of three bytes. If you use std::string in C++, the memory used to store the data is the same, but around it there is a wrapper that provides automatic memory management, so that when you do:

string s("hello");
string t("world");
s+=t;

s is extended in place, or moved elsewhere in memory if that is necessary to find a longer contiguous array of bytes. In the same way, when you call

s.c_str();

you get a pointer to the null-terminated array of char held by the std::string instance s. That pointer is only valid until s is next modified, so this

const char* text=s.c_str();
s="very long string that probably results in a complete reallocation of the array";
printf("%s",text);

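is likely to result in undefined behaviour (for example, a segmentation fault), because the assignment may reallocate the array and leave text pointing at freed memory.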

woggioni
0

They are stored in the form of bytes:

  • char : 1 byte
  • short : 2 bytes
  • int : 4 bytes
  • long : 4 bytes
  • float : 4 bytes
  • double : 8 bytes

Mojo Jojo
  • 2
    How do you know that `short` is two bytes long? that `int` is 4? that `long` is always 4? Because they aren't. –  Jan 25 '14 at 19:29
  • 2
    The sizes of the basic types are implementation defined. On Windows (using Visual Studio) `long` is 32 bits on both 32 and 64 bit platforms, while using GCC a `long` is 32 bits on 32 bit platforms and 64 bits on 64 bit platforms. Also, while `sizeof(char)` is *defined* to return `1`, a `char` may actually be something other than a *byte* (8 bits). – Some programmer dude Jan 25 '14 at 19:29
  • 1
    ...and it doesn't even answer the question. – Karoly Horvath Jan 25 '14 at 19:34
  • While it is already said by previous comments: A detailed answer to [size of int, long, etc](http://stackoverflow.com/a/589684/1960455) – t.niese Jan 25 '14 at 19:40
0

Preface:

Every time you build (compile and link) your project, an executable image is created, divided into three sections:

  • Code-Section

  • Data-Section

  • Stack

Now let's look at four different scenarios:

Scenario #1 - a local array (declared inside a function):

int func()
{

    char array[]="string";
    ...
}

After you load and run the program, every time the function is called, 7 bytes located somewhere on the stack are initialized with the following values: 's', 't', 'r', 'i', 'n', 'g', 0. The address on the stack where these bytes are located is determined by the value of the SP register when the function is called. Hence, each time the function is called, these 7 bytes may reside at a different address in memory.

Scenario #2 - a global array (declared outside a function):

char array[]="string";

During compilation (before you load and run the program), 7 bytes in the data-section are set with the following values: 's', 't', 'r', 'i', 'n', 'g', 0. These bytes are "hard-coded" into the executable image, and once it is loaded into memory (i.e., whenever you run the program), they reside at the same address throughout the execution of the program.

Scenario #3 - a local pointer to a string literal (declared inside a function):

int func()
{
    char* array="string";
    ...
}

Same as scenario #2, except for the fact that these 7 bytes are located in the code-section (which is a read-only section), instead of in the data-section (which is a read/write section). In addition, a pointer (array) is allocated on the stack and initialized (set to point to the address of "string" in the code-section) every time the function is called. The size of this pointer is typically 4 bytes on a 32-bit platform, or 8 bytes on a 64-bit platform.

Scenario #4 - a global pointer to a string literal (declared outside a function):

char* array="string";

Same as scenario #3, except for the fact that the pointer (array) is located in the data-section instead of on the stack, and is initialized once (during compilation) instead of every time the function is called (during execution).

barak manos
  • "Every time you build (compile and link) your project, an executable image is created" - at least on some implementations, but this is not universally true, and it does not contribute to the essence of the answer. I suggest you remove this (and the related parts) and re-word your answer in a manner that describes abstract behavior better. –  Jan 26 '14 at 21:09