First, this is not a duplicate of, e.g., How can I replace each newline (\n) with a space using sed?

What I want is to replace every single newline (\n) in a string, including a trailing one, like so:

  • printf '%s' $'' | sed '...; s/\n/\\&/g'
    

    should result in the empty string

  • printf '%s' $'a' | sed '...; s/\n/\\&/g'
    

    should result in a (not followed by a newline)

  • printf '%s' $'a\n' | sed '...; s/\n/\\&/g'
    

    should result in

    a\
    

    (the trailing \n of the final line should be replaced, too)

A solution like :a;N;$!ba; s/\n/\\&/g from the other question doesn't do that properly:

printf '%s' $'' | sed ':a;N;$!ba; s/\n/\\&/g' | hd

works;

printf '%s' $'a' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000  61                                                |a|
00000001

works;

printf '%s' $'a\nb' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000  61 5c 0a 62                                       |a\.b|
00000004

works;

but when there's a trailing \n on the last line

printf '%s' $'a\nb\n' | sed ':a;N;$!ba;s/\n/\\&/g' | hd
00000000  61 5c 0a 62 0a                                    |a\.b.|
00000005

it doesn't get quoted.

Benjamin W.
calestyo
  • Your last non-working example is just a copy of the last working one, btw. – Shawn May 29 '21 at 20:26
  • `but when there's a trailing` There is no trailing newline in the example presented.. – KamilCuk May 29 '21 at 20:43
  • @Shawn indeed... copy&paste error. I've corrected it. – calestyo May 30 '21 at 20:14
  • Perhaps look at the problem from a different angle. Using GNU sed `sed ':a;N;$!ba;s/$/\\/mg' file`? – potong May 31 '21 at 10:00
  • You mentioned in a comment wanting to stay POSIX compliant but none of your `printf` examples except the last one produce POSIX-compliant output since they don't have a terminating newline so what you're asking sed for is behavior that isn't defined by POSIX. – Ed Morton May 31 '21 at 21:55
  • @EdMorton true, but the printfs are just a convenient way for me to generate the input strings. In my use case the strings might come e.g. from a file. Also, it seems that sooner or later $'…' will be standardised in POSIX. – calestyo Jun 01 '21 at 01:36
  • @potong that doesn't seem to work as desired, e.g. when the input is $'a\n', the newline isn't quoted. OTOH, when the input is $'a\nb' a final \ is wrongly added after the b. – calestyo Jun 01 '21 at 01:40
  • @calestyo I understand that but it's irrelevant - my point is that, however you generate it, a string without a terminating newline is not a valid text line and therefore not a valid text file per POSIX and so what any text-processing tool (e.g. awk and sed) does with that input is undefined behavior per POSIX and so saying `I'd like to stay POSIX compliant` isn't really applicable to what you're trying to do as, again, what you're trying to do simply isn't defined by POSIX. – Ed Morton Jun 01 '21 at 01:51

3 Answers

3

Easier to use perl than sed, since it has (by default, at least) a more straightforward treatment of the newlines in its input:

printf '%s'   ''  | perl -pe 's/\n/\\\n/' # Empty string
printf '%s'   a   | perl -pe 's/\n/\\\n/' # a
printf '%s\n' a   | perl -pe 's/\n/\\\n/' # a\<newline>
printf '%s\n' a b | perl -pe 's/\n/\\\n/' # a\<newline>b\<newline>
# etc

If your inputs aren't huge, you could instead use

perl -0777 -pe 's/\n/\\\n/g'

to read the entire input at once rather than line by line, which can be more efficient.

Shawn
  • Thanks. I had found that to work myself. Just wanted to avoid another fork and do it all with sed. But it seems (at least according to https://stackoverflow.com/a/67755634/6646161) that this isn't possible with sed at all. – calestyo May 30 '21 at 20:16
3

how to replace newline characters with a string in sed

It's not possible. From the sed script's point of view, whether the trailing newline is missing or not makes no difference and is undetectable.

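Anyway, use GNU sed with -z, which splits the input on NUL bytes instead of newlines, so every newline (including a trailing one) ends up inside the pattern space: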

sed -z 's/\n/\\\n/g'
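
The following is a minimal sketch of how the -z variant behaves on the question's inputs, assuming GNU sed is installed as sed (-z is a GNU extension, not POSIX). The function names escape_newlines_sedz and hexbytes are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: put a backslash in front of every newline on stdin.
# Requires GNU sed: -z makes NUL the record separator, so newlines are
# ordinary characters in the pattern space.
escape_newlines_sedz() {
  sed -z 's/\n/\\\n/g'
}

# Hypothetical helper: print stdin as one contiguous hex string.
hexbytes() {
  od -An -tx1 | tr -d ' \n'
}

printf 'a\nb\n' | escape_newlines_sedz | hexbytes; echo   # 615c0a625c0a
printf 'a\nb'   | escape_newlines_sedz | hexbytes; echo   # 615c0a62
```

Because the whole input is held as a single record when it contains no NUL bytes, this is best suited to inputs that comfortably fit in memory.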
KamilCuk
  • Thanks. I've kinda already suspected that this wouldn't work with sed at all, but just wasn't totally sure. And in principle I'd like to stay POSIX compliant, so -z is something I'd rather avoid. But using the perl suggestion above works fine. – calestyo May 30 '21 at 20:16
2

GNU awk can use the RT variable to detect a missing record terminator:

$ printf 'a\nb\n' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1' 
a\
b\
$ printf 'a\nb' | gawk '{ORS=(RT != "" ? "\\" : "") RT} 1'
a\
b$ 

This adds a "\" before each non-empty record terminator.

Using any awk:

$ printf 'a\nb\n\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b\
$ printf 'a\nb\n' | awk '{printf "%s%s", sep, $0; sep="\\\n"}'
a\
b$ 

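Note that the printf inputs above already carry one extra trailing newline. For real input, use { cat file; echo; } | awk ... so that a newline is always appended before awk sees it.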
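
Putting the two pieces together, a minimal sketch of the portable awk approach could look like the following. The function names escape_newlines_awk and hexbytes are hypothetical helpers for this illustration; the sketch assumes the input is text without NUL bytes.

```shell
# Hypothetical helper: put a backslash in front of every newline on stdin.
# One newline is appended first, so awk's last record ends exactly where
# the original input ended; the separator is printed before each record
# except the first, which leaves the appended newline unprinted.
escape_newlines_awk() {
  { cat; echo; } | awk '{ printf "%s%s", sep, $0; sep = "\\\n" }'
}

# Hypothetical helper: print stdin as one contiguous hex string.
hexbytes() {
  od -An -tx1 | tr -d ' \n'
}

printf 'a\nb\n' | escape_newlines_awk | hexbytes; echo   # 615c0a625c0a
printf 'a\nb'   | escape_newlines_awk | hexbytes; echo   # 615c0a62
```

The appended newline means an input with and without a final newline produce different record counts, which is what lets awk tell the two cases apart.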