There is no $ in your file. $ is a symbol used to indicate end-of-string in a regular expression (just like ^ means start-of-string). In a tool that operates one line at a time, the end of the string it's working on is also the end of the line, so people using line-oriented tools often mis-state $ as meaning end-of-line, since in the context of that tool it's the same thing. $ is also used in other tools (e.g. cat -E) as an end-of-line indicator.
Some terminology/definitions:
\r is an escape sequence used in scripts to generate or match the CR (carriage-return) character ^M (control-M), ASCII 13
\n is an escape sequence used in scripts to generate or match the LF (line-feed) character ^J (control-J), ASCII 10
$ is a regexp meta-character used in scripts to indicate end-of-string (which often is also the end-of-line) and is also used by tools to indicate end-of-line when displaying text.
\n (i.e. LF alone) is considered a newline in UNIX
\r\n (i.e. CRLF) is considered a newline in DOS (see Why does my tool output overwrite itself and how do I fix it?)
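To check the definitions above against the actual bytes, a hex dump can help. This is only a minimal sketch: the helper name hex and the variable names are made up for illustration, and it assumes a POSIX od and tr are available.

```shell
# Hypothetical helper: print stdin as a compact string of hex byte values.
hex() { od -An -tx1 | tr -d ' \n'; }

unix_bytes=$(printf 'foo\n' | hex)     # f o o LF
dos_bytes=$(printf 'foo\r\n' | hex)    # f o o CR LF

echo "UNIX: $unix_bytes"   # UNIX: 666f6f0a
echo "DOS:  $dos_bytes"    # DOS:  666f6f0d0a
```

0d is CR (ASCII 13) and 0a is LF (ASCII 10). A literal $ would be byte 24 (ASCII 36), and it appears in neither dump.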
So when you do:
$ printf 'foo\n' | cat -vE
foo$
that does not mean there's a $ at the end of foo, it's just cat displaying a $ to show you where the end of the line is. When you do:
$ printf 'foo\r\n' | cat -vE
foo^M$
the ^M (control-M) is explicitly showing you the CR (carriage-return) character generated by \r, but the $ is not explicitly showing you the ^J (control-J) character, i.e. the LF (line-feed) generated by the \n. Instead, cat is displaying a different character, $, to show you the end of the line. If it DID show you ^Js then everything would be concatenated on one line, which would be tough to read. Consider the ease of reading this:
$ printf 'the\nquick\nbrown\nfox\n' | cat -vE
the$
quick$
brown$
fox$
vs if the output was this:
$ printf 'the\nquick\nbrown\nfox\n' | some_other_tool
the^Jquick^Jbrown^Jfox^J
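One way to convince yourself that the $ shown by cat is a display marker rather than data is to search the data for a literal $. The following is a rough sketch with made-up variable names; it assumes a cat that supports -vE (as GNU cat does) and uses grep -F so that $ is treated as plain text instead of a regexp metacharacter.

```shell
# Count lines containing a literal dollar sign.
# "|| true" only hides grep's non-zero exit status when nothing matches.
no_dollar=$(printf 'foo\n' | grep -cF '$' || true)
with_dollar=$(printf 'foo$\n' | grep -cF '$' || true)

# When the data really does end in a literal $, cat -vE shows two of them:
# the first is data, the second is the end-of-line marker.
shown=$(printf 'foo$\n' | cat -vE)

echo "$no_dollar"     # 0
echo "$with_dollar"   # 1
echo "$shown"         # foo$$
```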
You can never do either of these:
$ printf 'foo\nbar\n' | sed 's/$//' | cat -vE
foo$
bar$
$ printf 'foo\nbar\n' | sed 's/\n//' | cat -vE
foo$
bar$
to remove a LF, since sed already consumed the LF when reading the input, and the $ isn't itself the newline character; it's a metacharacter that lets you say in your regexp "match the end of the line" (which works in this case because sed reads one line at a time by default, so the end of the input string IS the end of the line).
You might ask - if sed consumed the LF when reading the input then why are there LFs at the end of each line of output? The answer is that sed adds a LF to every output line so that what it outputs is a valid POSIX text file (without terminating LFs you do not have a POSIX text file and so what any subsequent tool does with it is undefined behavior).
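The difference can be made visible by replacing instead of deleting. This is an illustrative sketch only; the <EOL> and <LF> markers and the variable names are arbitrary choices, not anything sed defines.

```shell
# $ matches the (empty) end of each line's content, so the marker lands
# before the LF that sed adds back on output.
marked=$(printf 'foo\nbar\n' | sed 's/$/<EOL>/')

# \n never matches here: the string sed is working on does not contain the LF.
unchanged=$(printf 'foo\nbar\n' | sed 's/\n/<LF>/')

printf '%s\n' "$marked"
# foo<EOL>
# bar<EOL>
printf '%s\n' "$unchanged"
# foo
# bar
```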
You can remove LFs, though, if you use a tool that does not read one line at a time. GNU sed has a -z option to read NUL-separated text instead of LF-separated text and in that mode you can remove LF characters:
$ printf 'foo\nbar\n' | sed -z 's/\n//' | cat -vE
foobar$
and now you can see how $ (the end-of-string metacharacter) is different from \n (the escape sequence to match the LF character):
$ printf 'foo\nbar\n' | sed -z 's/$//' | cat -vE
foo$
bar$
$ printf 'foo\nbar\n' | sed -z 's/\n/<LF>/' | cat -vE
foo<LF>bar$
$ printf 'foo\nbar\n' | sed -z 's/$/<EOS>/' | cat -vE
foo$
bar$
<EOS>$
So the quick answer for "how do you remove LFs with sed?" is this with GNU sed:
$ printf 'foo\nbar\n' | sed -z 's/\n//g'
foobar$
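and if you don't have GNU sed then you should just use awk. Actually, you should do that even if you do have GNU sed, since the above will read the whole input into memory at once (assuming a POSIX text file without NULs as input). Note that there is no cat -vE in these pipelines: the output no longer ends in a LF, so the $ after foobar is just the next shell prompt.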
$ printf 'foo\nbar\n' | awk -v ORS= '1'
foobar$
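If you want the joined result to still be a valid POSIX text file, i.e. to end in exactly one LF, one possible variation is to print each line without a terminator and emit a single newline at the end. This is a sketch under that assumption rather than part of the answer above; the hex helper and variable name are hypothetical.

```shell
# Hypothetical helper: print stdin as a compact string of hex byte values.
hex() { od -An -tx1 | tr -d ' \n'; }

# printf "%s" writes each line without its LF; print "" in END writes one LF.
joined_bytes=$(printf 'foo\nbar\n' | awk '{ printf "%s", $0 } END { print "" }' | hex)

echo "$joined_bytes"   # 666f6f6261720a  (f o o b a r LF)
```

Like the awk command above, this processes one line at a time, so it does not need to hold the whole input in memory.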