1

In particular I'm trying to transform all \r\n to \r\r\n. This is because iCloud's IMAP server sends \r\r\n, breaking the protocol and all sensibility (my only working theory is that they did this so it would only work with their own IMAP client at release some years ago). I need to write unit tests to simulate this.

It's remarkably tricky getting this to work in standard unix tools because of how they deal with line endings.

sed 's/\r\n/\r\r\n/g' - nope, does nothing

sed 's/\r/\r\r/g' - also does nothing

tr doesn't help much with strings; it only operates on single characters and seems to preserve the number of characters.

I'm not actually sure how to use Unix tools to do something this low level. Worst case, I can do this in a few lines of C, but I'd like to learn a more standard way to do it.

Per discussion in Jim's answer, the version of sed on Mac OS X (BSD) seems to behave differently from Linux. Ideally I need a Mac solution although I can more or less get this done on a different machine.

djechlin

4 Answers

3

If you're using bash as your shell, you can use its ANSI C quoting feature to force Mac OS X sed to work as you need.

sed -e $'s/$/\r\r/'

The $'...' is an ANSI C quoted string. Most of the characters within it are left unchanged; each of the two \r sequences is replaced by a carriage return in the string.

For example:

$ sed -e $'s/$/\r\r/' genouterr.sh | odx
0x0000: 23 21 2F 62 69 6E 2F 62 61 73 68 0D 0D 0A 66 6F   #!/bin/bash...fo
0x0010: 72 20 69 20 69 6E 20 7B 30 31 2E 2E 35 30 7D 0D   r i in {01..50}.
0x0020: 0D 0A 64 6F 0D 0D 0A 20 20 65 63 68 6F 20 22 73   ..do...  echo "s
0x0030: 74 64 6F 75 74 20 24 69 22 0D 0D 0A 20 20 65 63   tdout $i"...  ec
0x0040: 68 6F 20 22 73 74 64 65 72 72 20 24 69 22 20 3E   ho "stderr $i" >
0x0050: 26 32 0D 0D 0A 64 6F 6E 65 0D 0D 0A               &2...done...
0x005C:
$

The hex dump (odx is a home-brew program but I like its format) shows that there are two \r (0D) bytes before each newline (0A) which were not there in the original. Clearly, the choice of hex dump program doesn't affect the effectiveness of the sed command and the ANSI C quoting mechanism.
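Since odx is not a standard tool, the same check can be made with od, which ships with both Mac OS X and Linux. A minimal sketch, assuming a POSIX shell; the helper name show_bytes and the sample input are made up for illustration:

```shell
# show_bytes: hypothetical helper that prints stdin as hex bytes on one line
show_bytes() {
    od -An -tx1 | tr '\n' ' ' | tr -s ' '
}

# Sample input that already uses CR CR LF line endings;
# each line ending should appear as the byte sequence 0d 0d 0a
printf 'ab\r\r\ncd\r\r\n' | show_bytes
```

Piping the output of any of the sed commands here through the same helper shows whether the extra 0d byte was actually inserted.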

If you needed to change CRLF to CRCRLF, then you'd use:

sed -e $'s/\r$/\r\r/'

If you wanted to remove carriage returns, but only at the end of a line, then you could use:

sed -e $'s/\r\r*$//'

(tr can be used to remove all carriage returns, but not only those that precede a newline.)
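To illustrate the remark about tr, here is a small sketch, assuming a POSIX shell; the sample function and its contents are made up for illustration. The input has one carriage return in the middle of a line as well as CRLF line endings, and tr -d removes all of them:

```shell
# Sample with one CR in the middle of a line and CRLF line endings
sample() { printf 'a\rb\r\nc\r\n'; }

# Before: 0d appears mid-line and before each 0a
sample | od -An -tx1

# After tr -d: every CR is gone, including the one between "a" and "b"
sample | tr -d '\r' | od -An -tx1
```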

Jonathan Leffler
2

'sed' on MacOSX has slightly different behavior than on linux. You may want to try instructions from this source.

sed -e 's/ /\'$'\n/g'

which adds a new line.

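Another option is 'gsed', which is GNU sed (the same implementation found on Linux) and understands \r in patterns. Note that sed 's/\r\n/\r\r\n/g' will not match even there, because sed strips the trailing newline from each line before applying the pattern; anchor on the end of the line instead: gsed 's/\r$/\r\r/'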

philshem
  • You want to include information from the linked article. http://meta.stackexchange.com/a/183670/183887 – djechlin Aug 20 '13 at 20:32
1

You can use the end-of-line anchor character '$' to accomplish what you want:

% od -c foo
0000000   l   i   n   e   1  \r  \n   l   i   n   e   2  \r  \n   l   i
0000020   n   e   3  \r  \n
0000025
% sed 's/\r$/\r\r/g' < foo > bar
% od -c bar
0000000   l   i   n   e   1  \r  \r  \n   l   i   n   e   2  \r  \r  \n
0000020   l   i   n   e   3  \r  \r  \n
0000030

The above works with GNU sed, but not with BSD sed (which doesn't treat \r as a carriage return the way one would expect). On a Mac or other BSD-ish sed variant, you should be able to accomplish the desired replacement by putting a literal ASCII carriage return character (backslash-escaped if necessary) in the sed script in place of \r.
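One way to get a literal carriage return into the script without relying on sed's own \r handling is to generate it with printf and interpolate it. A minimal sketch, assuming a POSIX shell; the names cr and crlf_to_crcrlf are made up for illustration, and this is only one possible way to supply the literal character:

```shell
# Hold a literal carriage return in a variable (printf interprets \r itself)
cr=$(printf '\r')

# crlf_to_crcrlf: hypothetical filter turning CR LF line endings into CR CR LF
crlf_to_crcrlf() {
    # The script contains a real CR byte, so sed never has to parse "\r"
    sed "s/${cr}\$/${cr}${cr}/"
}

printf 'line1\r\nline2\r\n' | crlf_to_crcrlf | od -c
```

Because the shell does the substitution before sed sees the script, this form should not depend on which sed variant is installed.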

See this question for further details.

Jim Lewis
  • This didn't work - when I run that and then run the output through `od -c` I see `\r \n`, not `\r \r \n`. – djechlin Aug 20 '13 at 20:05
  • Operating system must matter here since I don't get that behavior. I'm on Mac OS X. – djechlin Aug 20 '13 at 20:07
  • @djechlin: It works for me, on Linux. I added the before/after `od -c` output. – Jim Lewis Aug 20 '13 at 20:08
  • I ran this on a Linux machine and it worked. I guess I can deal with it there. I'll upvote this and amend the question to ask for Mac solution. – djechlin Aug 20 '13 at 20:08
1

One way to do this on OSX is by using awk:

awk '/\r$/ {printf "%s\r\n", $0; next} {print}' file
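A variant of the awk approach rewrites the trailing carriage return in place with sub() and prints every line, whether or not it ended in CR. This is a sketch, assuming an awk that understands \r in regular expressions and strings; the sample input is made up for illustration:

```shell
# Double a trailing CR on each line; the bare pattern 1 prints every line,
# so lines without a CR pass through unchanged
printf 'line1\r\nplain\nline3\r\n' |
    awk '{ sub(/\r$/, "\r\r") } 1' |
    od -c
```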

If you want sed only then this should work on OSX:

sed -i.bak "s/"$'\r'"$/&&/" file
anubhava