
I was recently looking at string literals in C++, and I saw that string literals are just char arrays. In many use cases (such as printing the string literal), the array decays to a pointer to its first element, and the output routine keeps reading characters until it encounters a null terminator (`\0`), which marks the end of the string. This happens in the example below:

```cpp
#include <iostream>

int main()
{
    char str[] = "This is a string.";
    std::cout << str;
}
```

However, this only makes sense when the string literal is stored somewhere in memory, like how in the example above it was stored in the variable str. When we do something like this:

```cpp
#include <iostream>

int main()
{
    std::cout << "This is another string";
}
```

How can the string literal be printed without there being any array? Does the C++ compiler still create an array for the string literal and store it in memory, just without it being done manually? Or is the string printed some other way?

jabaa
linger1109
  • 3
    Yes, the string literal is still stored somewhere in memory, even if you don't assign it to a variable – UnholySheep Jul 24 '22 at 16:40
  • 3
    String literals are in fact stored in memory. A string literal is an object of type `const char[N]` (for a suitable value of `N`). You can print its address with `std::cout << (void*)"Your string literal";` – Igor Tandetnik Jul 24 '22 at 16:40
  • Thank you both so much, that clarifies it for me! – linger1109 Jul 24 '22 at 16:43
  • I removed the references to C. This is obviously C++. – jabaa Jul 24 '22 at 16:46
  • Related: [String literals: Where do they go?](https://stackoverflow.com/questions/2589949/string-literals-where-do-they-go) – Jason Jul 24 '22 at 17:06
  • Interesting fun fact: The behaviour of `std::cin >> str;` changes in C++20. The size of the array is inferred, preventing buffer overflows, and the raw pointer-accepting overloads that allowed easy overflows have been removed. – user4581301 Jul 24 '22 at 19:28

1 Answer


String literals need to be stored in your executable or library file. By far the most common way is to store them as plain bytes, just as in a text file, with no compression or any other transformation. This gives each string an address in the executable, much like the address of a function. These addresses can have names, called "symbols," although for string literals they are usually compiler-generated local labels. If you write the exact same string literal twice in the same program, it only needs to be stored once in the executable. Whether that happens is up to the implementation; the linker can do the deduplication, so it works even across separate source files.

Similar to what happens with the executable code of your program, the operating system typically loads your string literals from disk into memory when they are needed. This loading happens in "pages," so if you have a few string literals and one is read, others nearby will be read too, reducing total disk I/O.

Since modifying a string literal is undefined behavior in C++, compilers can place literals in a read-only section such as `.rodata` on platforms that have such a thing. That can mean that attempts to modify them will terminate the program (again, on some platforms).

John Zwinck