I'm new to Linux, so sorry if my question sounds dumb.

We know that Linux and Mac OS X use \n (0xa), which is the ASCII line feed (LF) character. MS Windows and Internet protocols such as HTTP use the sequence \r\n (0xd 0xa). If you create a file foo.txt in Windows and then view it in a Linux text editor, you’ll see an annoying ^M at the end of each line, which is how Linux tools display the CR character.

But why do Linux tools display the CR character as ^M? As I understand it, \r (carriage return) moves the cursor to the beginning of the current line, so the sensible way to handle it would be that, when you open the file, the cursor sits at the beginning of the line that contains \r. So why is ^M displayed at all?

PS: Some people have posted answers on how to remove ^M, but I want to know why ^M ends up being displayed rather than the cursor moving to the beginning of the line, which is the definition of carriage return.

  • Related cross-dupe in SuperUser: [Why are special characters such as “carriage return” represented as “^M”?](https://superuser.com/q/763879/950877) – Gino Mempin Oct 01 '20 at 00:32
  • See [What is `^M` and how do I get rid of it?](https://unix.stackexchange.com/q/32001/197080) – David C. Rankin Oct 01 '20 at 00:32
  • Which editors are you seeing this in? Neither Vim, Emacs nor Nano do this when they detect that a file has DOS line terminators. (Vim and Emacs do it when the file has mixed line terminators, but in that case you'd obviously want to know) – that other guy Oct 01 '20 at 00:33

1 Answer

The ASCII control characters like TAB, CR, NL and others are intended to control the printing position of a teletypewriter-like display device.

A text editor isn't such a device. It is not appropriate for a text editor to treat a CR character literally as meaning "go to the first column"; that would turn the editing experience into confusing gibberish.

A text editor works by parsing a text file's representation to create an internal representation, which is presented to the user. On Unix-like operating systems, a text file consists of zero or more lines, each terminated by the ASCII NL character. Any CR characters that occur just look like part of the data, not part of the line termination.
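
As a rough illustration of that last point (not taken from any particular editor), `cat -v` in GNU and BSD userlands prints non-printing bytes in caret notation, so it shows what a tool sees when it splits strictly on NL. The sample text and the function name `show_crlf_sample` are made up for this sketch:

```sh
# Hypothetical helper: emit two CR-LF terminated lines and display them
# the way a strictly NL-based tool would see them.
show_crlf_sample() {
    # printf expands \r to CR (0x0d) and \n to NL (0x0a).
    # cat -v leaves NL alone but renders other control bytes visibly.
    printf 'one\r\ntwo\r\n' | cat -v
}

show_crlf_sample
# prints:
# one^M
# two^M
```

Each line is still terminated by NL; the CR is simply the last data byte on the line, and the tool chooses to make it visible.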

Not all editors behave the same way. For instance, the Vim editor will detect that a file uses CR-LF line endings, and load it properly using that representation. A flag is set for that buffer which indicates that it's a "DOS" file, so that when you save it, the same representation is reproduced.
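
The kind of detection described above can be approximated in a few lines. This is only a simplified sketch and not Vim's actual code; the function name `guess_fileformat` and the all-or-nothing rule are assumptions made for the example:

```sh
# Hypothetical helper, not taken from Vim: report "dos" when every line
# read from stdin ends in CR, otherwise "unix".
guess_fileformat() {
    awk '
        /\r$/ { crlf++ }          # line (split on NL) still ends in CR
        END {
            if (NR > 0 && crlf == NR) print "dos"
            else print "unix"
        }
    '
}

printf 'one\r\ntwo\r\n' | guess_fileformat   # prints: dos
printf 'one\r\ntwo\n'   | guess_fileformat   # prints: unix
```

Under this rule a file with mixed terminators falls back to "unix", which is the case where the leftover CR characters remain visible as `^M`.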

That said, there is a feature in the actual Linux kernel for representing control characters like CR using the ^M notation. The TTY line discipline for any given TTY device can be configured to print characters in this notation, but only when echoing back the characters received.

Demo:

$ stty echoctl # turn on notational echo of control characters
$ cat # run some non-interactive program with rudimentary line input
^F^F^F^F^F^F
^C
$

Above, the Ctrl-F that I entered was echoed back as ^F. So, in fact there is a "Linux editor" which uses this notation: the rudimentary line editor of the "canonical input mode" line discipline.
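
For reference, the letter after the caret is not arbitrary: caret notation shows the printable character whose code differs from the control character's code by 0x40 (64). CR is code 13, and 13 + 64 = 77 is `M`. A minimal sketch of that mapping, with `caret` being a made-up helper name:

```sh
# Hypothetical helper: print the caret notation for a control character,
# given its numeric code (0-31 or 127).
caret() {
    # Flip bit 6 (0x40) to get the printable partner, e.g. 13 -> 77 ("M").
    partner=$(( $1 ^ 64 ))
    # Build an octal escape such as \115 and let printf render it.
    printf "^\\$(printf '%03o' "$partner")\n"
}

caret 13    # CR,     prints: ^M
caret 6     # Ctrl-F, prints: ^F
caret 127   # DEL,    prints: ^?
```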

Kaz
  • Thanks for your answer. So let's say a file has three lines, and I want this behaviour: when I open the file, the cursor should be at the beginning of the second line. Which control character should I use when storing the file? –  Oct 01 '20 at 00:38
  • @slowjams There is nothing in a file to control an editor that way. An editor which simply dumps the data to the display so that the cursor ends up moved around would not be very useful; an editor must protect the user from such effects. – Kaz Oct 01 '20 at 00:41
  • @slowjams You can prepare a file which, if you `cat` it to the terminal, will control the cursor. You can certainly make a three-line file such that when this is dumped to the terminal, the cursor will be at the beginning of the third line. You can do that via an embedded CR. – Kaz Oct 01 '20 at 00:42
  • @slowjams A line feed (LF) normally moves the cursor to the start of the next line; i.e. it performs the line feed *and* the carriage return action. This is not specified by ASCII; it's a Unix TTY behavior. That behavior is not hard-coded either; it is a default which is configured by TTY flags (`stty onlcr`). It goes hand in hand with text files using LF as the line terminator. If we dump a text file to the terminal, it looks right because the line feeds are transformed to CR-LF by the TTY driver before being sent to the terminal emulator, thanks to that `stty onlcr` setting. – Kaz Oct 01 '20 at 00:43
  • If you do `stty -onlcr` and `cat` a file, you will see a "staircase" because the linefeed after each line now just moves the cursor directly down, without moving it to the start of the line. – Kaz Oct 01 '20 at 00:55
  • The `echoctl` setting acts separately from the "canonical mode discipline" of a terminal. Just try it: run `stty -icanon; cat; stty icanon`, then enter your Control-Fs. –  Oct 01 '20 at 04:24
  • Unless you call the entire terminal driver ("discipline") an "editor" just for the sake of the argument ;-) –  Oct 01 '20 at 04:26