40

(a)

string str = "Hello\nWorld";

When I print str, the output is:

Hello
World

(b)

string str;
cin >> str;      //given input as Hello\nWorld

When I print str, the output is:

Hello\nWorld

What is the difference between (a) and (b)?

Boann
kranti sairam
  • On Linux, you can `echo -e "Hello,\nWorld" | ./testprogram` to convert the escape character to a newline rather than the literal tokens `\` and `n`, before passing it to the program. – Davislor Aug 16 '18 at 05:33
  • [c++ - Why is "using namespace std" considered bad practice? - Stack Overflow](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice) ... – user202729 Aug 17 '18 at 09:32

7 Answers

58

The C++ compiler follows specific rules when it encounters escape sequences in literals (see the documentation). When you write \n in a string literal, the compiler replaces it with a line feed (value 0xa in ASCII). So instead of two symbols, \ and n, you get one symbol with the binary code 0xa (I assume you use ASCII encoding), which makes the console move the output to a new line when printed. When you read a string at runtime, the compiler is not involved, so your string contains the actual symbols \ and n.

Slava
  • 8
    Exactly right. The escape codes are a feature of the **source code parser**, not a feature of the C++ string data. – Euro Micelli Aug 15 '18 at 23:02
  • 2
    The language definition does not require any particular value for `'\n'`. – Pete Becker Aug 16 '18 at 02:15
  • 1
    @PeteBecker: The answer says the value it gives is for ASCII, which covers almost all modern SBCS systems. People working with other character sets will have to find the applicable value themselves. – Ben Voigt Aug 16 '18 at 13:02
  • @PeteBecker I did not mean that language requires, I just assumed that OP uses PC with ASCII encoding, edited answer to clarify. – Slava Aug 16 '18 at 13:07
  • 1
    @BenVoigt -- there is no inherent connection between character encoding and the value of `'\n'`. `0x0A` is convenient, because that's a value that isn't otherwise used. On Windows, `'\n'` is turned into two ASCII values on output to a terminal that uses ASCII. Internally the value of `'\n'` could be -2 and it would work just as well. Granted, back in the olden days, `'\n'` and `'\r'` mapped directly into ASCII codes, but that's an implementation detail, not a requirement. – Pete Becker Aug 16 '18 at 13:22
  • 3
    @PeteBecker: On Windows, `"\n"` is `{ 0x0A, 0x00 }` as well. It's the file (or console) I/O that translates it to 0D 0A... for the purposes of the C++ program it is just one. You could have a system where `'\n'` is -2, sure, but that would not be an ASCII system. The standard guarantees that character escapes are converted to the corresponding values from the execution character set. If the execution character set is ASCII, `'\n' == 10`, guaranteed. – Ben Voigt Aug 16 '18 at 13:36
  • 1
    @BenVoigt -- "On Windows, `"\n"` is `{ 0x0A, 0x00 }`" that's the **behavior** of the compiler you used. The runtime library that comes with the compiler gets linked into your program. It translates whatever character the compiler uses to represent `'\n'` into `0x0D 0x0A` and sends it to standard out, because that's what Windows requires to start a new line. `0x0D 0x0A` are, indeed, the ASCII codes for CR and LF. If you look at the requirements in the C++ standard, you'll see "new-line NL(LF) \n" in the table of character escape sequences. "NL(LF)" is not the name of a character code. – Pete Becker Aug 16 '18 at 14:47
  • 2
    @PeteBecker: LF is the name of a character code. The parentheses aren't part of the code name, they indicate additional information. But my main point is this: `'\n'` is always exactly one character. The compiler must not convert it to a CR-LF pair. If the execution character set is ASCII, then the one character `\n` has to be 0A. If the execution character set is EBCDIC, then obviously it will be different. But newline translations have absolutely nothing to do with string literals. – Ben Voigt Aug 16 '18 at 15:42
  • @PeteBecker: newline translations only happen on input/output, and only on files opened in text mode. They have nothing to do with internal string representations. – Roel Schroeven Aug 17 '18 at 07:33
14

When specified in a string literal, "\n" will be translated to the matching ASCII code (0x0a on Linux) and stored as that single character. It will not be stored as a backslash followed by a literal n. Escape sequences are for your convenience only, to allow string literals with embedded newlines.

On the other hand, your shell, running in the terminal, does not do such substitution: it submits a literal backslash and n, which will be printed as such.

To have a newline printed, enter a newline:

$ echo "Hello
World" | ./your-program
erenon
  • 1
    While implementation of `echo` varies from computer to computer, `echo -e "hello\nworld"` is a pretty reliable way to get `\n` to be interpreted as a newline. If not, then `printf "hello\nworld\n"` is guaranteed to work if you have `printf` on your machine. – BallpointBen Aug 15 '18 at 20:20
  • 2
    Not the ASCII code, the "execution charset" (character encoding)—almost certainly not ASCII (even if, for this character, the code might be the same). – Tom Blodget Aug 15 '18 at 23:14
  • In particular, there are still computers running where the execution charset is EBCDIC, and '\n' is nothing like 0x0a. – Martin Bonner supports Monica Aug 16 '18 at 10:54
  • It might be good to know how to enter a literal newline from a terminal: in most terminals, press Ctrl+V followed by Ctrl+J. – mindriot Aug 16 '18 at 21:01
  • @BallpointBen, `printf` (the command-line utility) is standard, so you're very likely to have it (probably built-in to the shell). `echo -e` is quite common, but notably Dash (Debian's/Ubuntu's `/bin/sh`) doesn't treat that kindly. Then there's the `$'..'` quoting that does process C-style escapes. It's not standard and so dash doesn't support it either, but almost all other shells do support it. – ilkkachu Aug 16 '18 at 23:52
  • In the latter case, note that `std::cin >> str` where `str` is a `std::string` only reads 1 word (space-delimited token) from standard input. – user202729 Aug 17 '18 at 09:39
10

In cout << "Hello\nworld", the compiler converts the string literal into a compiled string in which escape codes are replaced by the characters they stand for. When the statement runs, cout does not see a two-character "\n" sequence but the single newline character.

cin, on the other hand, receives every typed character as-is at runtime and does not convert escape codes. If you want those escape codes converted, you have to write a replace function yourself.

Krishna Choudhary
6

cin does not include a C++ compiler. Escape sequences in string literals are a feature of C++'s lexer, which is part of the C++ compiler. Streams more or less give you what came from the OS (they may do some CRLF -> LF translation or similar, depending on the OS, but that's it).

Petr Skocik
4

Why escape characters are not working when I read from cin?

Because stream readers are defined to be that way. At the core, each character is read separately. Only higher-level functions provide additional meaning to the characters.

When a compiler processes the string literal "Hello\nWorld", its file reader also passes it the two characters \ and n. Only the C++ compiler/parser translates them into one character, based on the rules of the language.

R Sahu
4

In compiled code the character literal '\n' is replaced by an implementation-specific value that the runtime system treats as a newline character. The language definition does not require any particular value.

When reading input from the console or a file the incoming text is not being compiled, and the character sequence “\n” does not have any special meaning. It is simply two characters.

Pete Becker
3

Escape sequences in a string literal are interpreted by the compiler. In source code, the sequence \n consists of two actual characters, which the compiler converts to a single newline character during compilation. The same sequence is not interpreted in any way when you enter it on the command line, so it results in exactly the two characters that you entered.

If you want to process your string to interpret escape sequences, you will have to do it yourself (or use an appropriate library).

Mad Physicist