You are trying to substitute the hex string \x0D\x0A, which is nothing more than CRLF, or \r\n.
Since awk by default splits its records on the <newline> character (which is LF), you never actually have to match the <newline> character \n (or \x0a). All you need to do is replace \r with ,\r (0x2c is the hex value of ,). So this should do the trick:
awk '(NR>1){sub("\r$",",\r"); print}' file
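To see the effect on actual bytes, here is a minimal sketch that runs the same substitution on a made-up three-line CRLF file. The sample data and the variable names tmp and result are assumptions for illustration only.

```shell
# Hypothetical sample: a header plus two data rows, all CRLF-terminated.
tmp=$(mktemp)
printf 'h1,h2\r\na,b\r\nc,d\r\n' > "$tmp"

# NR>1 skips the first line entirely (it is neither changed nor printed);
# on every other line, the trailing CR becomes ",CR".
result=$(awk '(NR>1){sub(/\r$/,",\r"); print}' "$tmp")

# od -c makes the otherwise invisible \r and \n bytes visible.
printf '%s\n' "$result" | od -c
rm -f "$tmp"
```

Piping through od -c is handy here because a terminal would otherwise interpret the carriage return instead of showing it.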
So why was your script failing?
As mentioned before, awk works in records, and the default record separator is the <newline> character. This means that the <newline> character, also written as \n and having hexadecimal value \x0a, is never part of the record $0. Also, the print statement automatically appends the output record separator ORS after the record. By default this is again the <newline> character. So you did not have to substitute it. All you had to do was:
awk 'NR > 1 {sub(/\x0D$/,"\x2C\x0D"); print}' test.csv > testfixed.csv
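A quick way to check that the newline never reaches $0 while the carriage return does is to count the characters awk sees in a record. This is a small sketch with a made-up two-letter input; the variable name len is just for illustration.

```shell
# One CRLF-terminated line: the bytes are a, b, CR, LF.
len=$(printf 'ab\r\n' | awk '{ print length($0) }')

# Prints 3: a, b and the CR. The LF was consumed as the record separator.
echo "$len"
```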
So is it possible to substitute by means of hexadecimal values?
Yes, clearly it is!
echo -n "Hello World" | awk 'sub(/\x57\x6f\x72\x6c\x64/,"\x43\x6f\x77")'
But how can I change <newline>?
You can just redefine the output record separator ORS:
awk -v ORS="whatever" '1'
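As a rough illustration of what redefining ORS does, the sketch below first joins lines with a semicolon and then converts LF line endings to CRLF. The sample input and the variable names joined and crlf are assumptions, not part of the original question.

```shell
# Join lines with ';' instead of a newline (sample input is made up).
joined=$(printf 'a\nb\nc\n' | awk -v ORS=';' '1')
echo "$joined"    # prints a;b;c;

# Turn LF line endings into CRLF; awk expands the \r\n escapes in the -v value.
crlf=$(printf 'a\nb\n' | awk -v ORS='\r\n' '1')
printf '%s\n' "$crlf" | od -c
```

Note that ORS is written after every record, including the last one, which is why the first result ends with a trailing semicolon.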
Also, using GNU awk, you can follow glenn jackman's solution.
Very much related: Why does my tool output overwrite itself and how do I fix it?