3

I used

sed 's/\r\n$//' inputFile

but it didn't work, and I don't know why.

I also tried

awk '{ printf "%s", $0 }' inputFile

but it deletes only the \n, not the \r.

How can I remove the specific combination CRLF (\r\n) at the end of a line on Linux?

P.S.

I don't think this is a duplicate of this, given the specific condition of my question: I want the CRLF (\r\n) at the end of a line to be removed. tr won't work, because the characters it removes are not necessarily at the end of the line, and the admin does not allow dos2unix to be installed. In my case sed 's/\r\n$//' inputFile is not working, and I have tried pretty much all of the solutions suggested in this.

By the way, with tr the characters to be removed are not necessarily at the end of the line, because \r\n in tr is a set containing \r and \n. In other words, it would also delete a \r that is in the middle of the line.
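
A quick way to see the tr behaviour described above. This is only a minimal sketch: the sample string and the variable name tr_result are made up for illustration, and od is used just to show the resulting bytes as hex.

```shell
# Sample input: a CR in the middle of the line and a CRLF at the end.
# tr -d treats '\r\n' as the set {CR, LF}, not as a two-character sequence.
tr_result=$(printf 'foo\rbar\r\n' | tr -d '\r\n' | od -An -tx1 | tr -d ' \n')
echo "$tr_result"   # 666f6f626172 ("foobar"): the middle CR was deleted as well
```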

Clarification:

I have one line of input. I want the \r\n to be totally removed.

Marcus Thornton
  • 2
    try tr -d '\a\b\r' inputFile – P.... May 12 '16 at 09:49
  • 2
    use dos2unix, or look at one of the hundreds of answers for the same question. – 123 May 12 '16 at 09:50
  • @123 I cannot use dos2unix nor install it because the admin does not allow it. – Marcus Thornton May 12 '16 at 09:56
  • `'s/\r\n$//'` obviously doesn't work as sed is a line reader and doesn't see the newline. Also why would you ever have a carriage return in the middle of a line ? – 123 May 12 '16 at 10:13
  • @123 Why can't a carriage return be in the middle of a line? – Marcus Thornton May 12 '16 at 10:14
  • @MarcusThornton It can be, I just cannot see any useful reason that it would be. – 123 May 12 '16 at 10:16
  • 2
    Why would you want the newline to be deleted along with the carriage return? That makes no sense. A text file will be one long spliced line. For scripts that means death. – Jens May 12 '16 at 10:19
  • @123 I cannot define what's going to be input, and every character is possible in a line other than `\n` – Marcus Thornton May 12 '16 at 10:20
  • Your question should be clarified a little. If you see `\r\n` in a file you want it totally removed? Or do you want it to be replaced with `\n`? – Aaron McDaid May 12 '16 at 10:22
  • @Jens, I agree it seems like a strange request. But it does appear that the questioner does want `\r\n` to be totally removed. (Such that only *isolated* `\r` or isolated `\n` would remain) – Aaron McDaid May 12 '16 at 11:28
  • @123 If you are doing some kind of character animation in a terminal using terminal control characters; you might have several CRs in the one line to send the cursor back so you can overwrite your previous output. This kind of thing was common in the 80's when chatting on arpanet. – Niall Cosgrove May 12 '16 at 12:01

8 Answers

3

Answering the last comment: only one line... Pure bash:

read string <InputFile
echo -n "${string%$'\r'}"

Explanation: read reads one line at a time, so it naturally drops the trailing newline. Then ${variable%$'\r'} removes one trailing CR.

Have a look at help read for the limitations and options of this approach:

printf ' foo\\x\r\t bar\r\n' > InputFile
IFS= read -r string <InputFile 
echo -n "${string%$'\r'}" | od -A n -t a -t c
      sp   f   o   o   \   x  cr  ht  sp   b   a   r
           f   o   o   \   x  \r  \t       b   a   r

(I use both -t a and -t c: the -t c output is more readable but doesn't show spaces explicitly.)

This may work under a regular shell too:

CR=`printf \\\r`
read string <InputFile
echo -n "${string%$CR}"

1st answer: End of line and line separator

To wipe a CR at the end of a line, use this:

sed -e 's/\r$//'

In Unix sed, lines are separated by \n, so as long as you don't use sed's N command, you will never find a \n within a line.
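
To make that concrete, here is a small sketch comparing the two expressions on a one-line CRLF input. It assumes GNU sed (for the \r and \n escapes in the regex); the variable names unchanged and cr_removed are only illustrative, and od shows the bytes as hex.

```shell
# The pattern space holds "foo\r" without the newline, so \r\n$ can never match.
unchanged=$(printf 'foo\r\n' | sed 's/\r\n$//' | od -An -tx1 | tr -d ' \n')
# Matching only the CR at end of line works; sed re-adds the newline on output.
cr_removed=$(printf 'foo\r\n' | sed 's/\r$//' | od -An -tx1 | tr -d ' \n')
echo "$unchanged"    # 666f6f0d0a: CR LF still present
echo "$cr_removed"   # 666f6f0a: CR gone, LF kept
```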

But if you want to merge all your lines:

sed -ne ':;N;$!b;s/\r\n//g;p'

This will drop every CRLF except the one at the very end of the file (you could drop that one with bash ${var%$'\r\n'} or head -c -2):

sed -ne ':;N;$!b;s/\r\n//g;p' | head -c -2
F. Hauri - Give Up GitHub
  • 1
    Doesn't remove the newline as well. – 123 May 12 '16 at 10:17
  • Answer edited (anyway, I'm not sure I understand what you need...) – F. Hauri - Give Up GitHub May 12 '16 at 10:28
  • wrt `This will drop all CRLF except at very end of file.` - the OP has a single line so the CRLF at the end of the file is the only one he needs dropped. – Ed Morton May 12 '16 at 12:21
  • @EdMorton Ok, for a *one-line* input, things are much simpler... Answer edited! – F. Hauri - Give Up GitHub May 12 '16 at 14:04
  • The simpler solution you posted will undesirably strip all leading and trailing white space and remove all backslashes from the line. Try it on a file you create with `printf ' foo\\xbar\r\n' > InputFile`. You at least need to set IFS to null and add the -r option to read: `IFS= read -r string – Ed Morton May 12 '16 at 14:11
  • @EdMorton Ok, have a look at `help read` and try, with your test case `printf ' foo\\xbar\r\n' > InputFile`, this: `read -r string – F. Hauri - Give Up GitHub May 12 '16 at 15:51
  • @EdMorton By using `IFS= read -r`, the only thing bash (v4.2+) could not handle is `$'\0'`. But I've found a way... There is a [base64 encoder](http://f-hauri.ch/vrac/base64.sh.txt) written in **pure bash**. (You may also find an [ungolfed base64 encoder](http://f-hauri.ch/vrac/base64_ungolfed.sh.txt) version too ;-). It's a *proof of concept*: it works with binary files, but it's very slow!! – F. Hauri - Give Up GitHub May 12 '16 at 16:07
1

Perl is reasonably portable, and well equipped to handle this.

perl -pe 's/\r\n//' file

This will leave any lone \r or \n but remove them both if they occur one after the other in this specific order.

tripleee
1

A completely different solution, just for fun (but it works). Assuming you've got xxd installed:

xxd -ps -c 1 inputFile |
    awk 'BEGIN {prev=""} {if ($0=="0a" && prev=="0d") {prev="skip"} else { if (prev!="skip" && prev!="") {print prev} prev=$0 } } END {if (prev!="") {print prev}}' |
    xxd -r -ps

Basically it translates the file into two hex digits per character, one character per line, then filters that with awk, looking for two consecutive lines ("0d" then "0a", which is \r\n) and skipping them.

But in reality, I'd just recommend using python or perl. One of them should already be on the system. For example:

<inputFile python2 -c 'import sys; sys.stdout.write(sys.stdin.read().replace("\r\n",""))'
viraptor
0

You may use GNU awk:

Before:

0000000   S   i   n   g   o  \n   D   i   n   g   o  \n  \r   P   i   n
0000020   g   o  \r   M   i   n   g   l   o  \r  \n   S   i   n   g   l
0000040   i  \r  \n
0000043

Operation

$ awk 'BEGIN{RS="^$"}{printf "%s",gensub(/\r\n/,"","g")}' file1 > file2  && mv file2 file1

After

$ od -tc file1
0000000   S   i   n   g   o  \n   D   i   n   g   o  \n  \r   P   i   n
0000020   g   o  \r   M   i   n   g   l   o   S   i   n   g   l   i
0000037

You may wish to change gensub(/\r\n/,"","g") to gensub(/\r\n/,"\n","g") in case you wish to replace CRLF with LF.

Notes:

  1. You shouldn't use print inside awk as it will generate a LF at the end. Instead use printf with a format string.
  2. I have made an edit to the answer incorporating the changes suggested by @ed-morton in comment#1. Also, this comment has some platform specific information which might be useful.
sjsam
  • 1
    You should mention that's gawk/mawk-specific due to `gensub()` and setting `RS="\0"` is non-portable (instead you need to use `RS="^$"` to read in a whole file at a time) and it won't work on all platforms (e.g. cygwin) because in some the underlying C primitives gawk uses to read files will strip the `\r`s before gawk sees them so you need to set `-v BINMODE=3` to stop that from happening. – Ed Morton May 12 '16 at 12:06
  • 1
    @EdMorton : I never knew we could set `RS=^$`, thankyou for this tip. Incorporating this to my answer. :D – sjsam May 12 '16 at 12:13
  • 1
    Yes and it's guaranteed to work in any awk that accepts a multi-char RS since it means `start of string then end of string` across the whole file so the only way it can match the contents of a file is if the file is empty. So it gives you a Record Separator that cannot match any string in a non-empty file and so guarantees to read the whole file as a single record. Use of `\0` is hit-or-miss, see http://www.gnu.org/software/gawk/manual/gawk.html#gawk-split-records. – Ed Morton May 12 '16 at 12:18
0

If it's only one line and you know there's definitely \r\n at the end, you can just use head and strip the last 2 bytes:

head -c -2 inputFile
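
If you are not certain the file ends in \r\n, a guarded variant could look like the sketch below. The function name strip_final_crlf is hypothetical, and the negative count for head -c is a GNU coreutils extension, so this is an assumption about the platform rather than a portable solution.

```shell
strip_final_crlf() {
    # Look at the last two bytes of the file as hex.
    if [ "$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "0d0a" ]; then
        head -c -2 "$1"   # ends in CR LF: drop exactly those two bytes (GNU head)
    else
        cat "$1"          # anything else: pass the file through unchanged
    fi
}

# Usage on a throwaway file with a CR in the middle and a CRLF at the end.
tmp=$(mktemp)
printf 'foo\rbar\r\n' > "$tmp"
strip_final_crlf "$tmp" | od -An -tx1   # the middle CR (0d) is kept
```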
viraptor
0

With GNU awk for multi-char RS and Binary Mode:

$ od -tc file
0000000   f   o   o       b   a   r  \r  \n
0000011

$ awk -v BINMODE=3 -v RS='\r\n' -v ORS= '1' file | od -tc
0000000   f   o   o       b   a   r
0000007

Here's why you need to set BINMODE=3:

$ awk '1' file | od -tc
0000000   f   o   o       b   a   r  \n
0000010

$ awk -v BINMODE=3 '1' file | od -tc
0000000   f   o   o       b   a   r  \r  \n
0000011

Without it, on some platforms (e.g. cygwin) gawk never even sees the \r; the underlying C primitives remove it.

Ed Morton
0

I think the following method is also good if you are not familiar with perl or python:

sed 's/\r$//' inputFile | awk '{printf "%s", $0}'
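
A quick check of this pipeline on a one-line sample, as a sketch: the sample string and the variable name joined are only illustrative, and GNU sed is assumed for the \r escape. Note that on multi-line input the awk step also joins all lines, because it prints each one without a newline.

```shell
# sed removes the CR at end of line; awk then prints the line without a newline.
joined=$(printf 'foo\rbar\r\n' | sed 's/\r$//' | awk '{printf "%s", $0}' | od -An -tx1 | tr -d ' \n')
echo "$joined"   # 666f6f0d626172: the middle CR is kept, the final CR LF is gone
```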
Marcus Thornton
-2

To make unix lines out of DOS lines, you simply have to remove the carriage returns (CR). The sed command is as follows:

sed 's/\r//g' inputfile > outputfile

Just give it a try.

A15N