
The line feed (LF) has an ASCII code number 10. Is '\n' implemented as something MORE than a simple non-printable ASCII character?

Are there any C language specific details?

Keith Thompson
Rasteril

1 Answer


The character constant '\n' represents what the C standard calls the "newline" character.

The standard doesn't say what the value of that character is. It happens to be 10 (LF) on systems that use an ASCII-based character set (that includes Unicode) -- but the C standard doesn't require ASCII. (Starting with the 1999 standard, implementations may indicate that they support Unicode by predefining __STDC_ISO_10646__, but that's optional.)

IBM mainframe systems, for example, use a different and incompatible character set called EBCDIC. On such systems, '\n' will have a value other than 10.

Incidentally, '\10' has the value 8, not 10. That syntax uses octal (base 8), not decimal. Character 10 is represented as '\12' (or as '\xa' in hexadecimal). (The question was updated to correct this.)

Keith Thompson
  • So on a Linux system, the newline character should "unfold" into a simple LF non-printable character? – Rasteril Jul 31 '14 at 15:47
  • Note that there are platforms which even though they are ASCII-based use something other than 10 (LF) for `'\n'` (e.g. "classic" Mac OS used 13 (CR)). – Paul R Jul 31 '14 at 15:48
  • @Rasteril: On a Linux system, the character constant `'\n'` is essentially equivalent to the integer constant `10` (or `012`, or `0xa`). When printed to an output stream in text mode, it's converted to the system's representation for an end-of-line marker -- which, on Linux, happens to be the character with the same value. (On Windows, printing `'\n'` has the effect of writing two characters, `CR` and `LF`.) – Keith Thompson Jul 31 '14 at 15:49
  • @PaulR: I know classic Mac OS used `CR` to mark end-of-line in text files -- but did C compilers actually use the value 13 for the constant `'\n'`, or was it converted on input and output? I suspect the latter, but I'm not sure. – Keith Thompson Jul 31 '14 at 15:50
  • @Keith: yes, IIRC most/all compilers converted both `\r` and `\n` to CR (13), as LF (10) was not really supported by the OS. One more data point: Windows generates CR + LF for `\n` (unless the file is opened in binary mode). – Paul R Jul 31 '14 at 15:52
  • @PaulR: That's not quite what I meant to ask. In a C program compiled and executed on classic Mac OS, would the expression `'\n' == 13` be true, or would `'\n' == 10` be true? The value of the constant `'\n'` and the character printed by `putchar('\n')` are separate questions. (And I don't think the C standard would allow `'\n' == '\r'`.) – Keith Thompson Jul 31 '14 at 15:54
  • @Keith: good question - I don't know the answer I'm afraid and I don't have a classic Mac OS system any more to try it on. The only thing I can say for certain is that writing `'\n'` to a file generated an ASCII CR. – Paul R Jul 31 '14 at 16:02
  • FWIW, on modern systems with `__STDC_ISO_10646__` defined and `__STDC_MB_MIGHT_NEQ_WC__` not defined, `'\12'` and `'\n'` are required to have the same value, the value of the UCS/Unicode character `U+000A`. – R.. GitHub STOP HELPING ICE Jul 31 '14 at 17:01
  • @PaulR: I've posted that question about classic Mac OS [here](http://stackoverflow.com/q/25065940/827263). – Keith Thompson Jul 31 '14 at 18:35