You don't need to set the locale, but you do need to account for strange or erroneous input:
If the input has a dot, or any character that has a byte ordinal higher than ASCII "1" (which is a LOT of stuff):
9_6_F-repl 24834 9.
9_6_F 56523 9.
9_annua_M-merg 122663 9.
9_huetii_F-merg 208077 9.
9_annua_M-merg 122663 :5.333
This would completely fail to produce the correct result, since $3 is being compared as a string, where an ASCII "9" is larger than ASCII "1":
mawk2 'sub("\r*",_)*(10<$3)'
9_6_F-repl 24834 9.
9_6_F 56523 9.
9_annua_M-merg 122663 9.
9_huetii_F-merg 208077 9.
9_annua_M-merg 122663 9.
9_annua_M-merg 122663 :5.333
To rectify it, simply add + next to $3, which forces a numeric comparison:
mawk 'sub("\r*",_)*(10<+$3)'
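As a rough illustration of the difference the + makes, here is a small sketch using plain awk rather than mawk. The ":5.333" row is taken from the input above; the "12.5" row is a made-up value added only so that something legitimately passes the filter. It assumes an awk that compares strings byte by byte, which is the default behaviour in gawk and mawk.

```shell
# Two sample rows: the ":5.333" row comes from the input above, the "12.5" row
# is a hypothetical value that really is greater than 10.
sample='9_annua_M-merg 122663 :5.333
9_6_F 56523 12.5'

# Without "+": ":5.333" does not look like a number, so 10<$3 falls back to a
# string comparison and the row passes, because "1" sorts before ":".
as_string=$(printf '%s\n' "$sample" | awk '10<$3')

# With "+": $3 is forced to a number (":5.333" becomes 0), so only 12.5 passes.
as_number=$(printf '%s\n' "$sample" | awk '10<+$3')

printf 'string comparison:\n%s\n\nnumeric comparison:\n%s\n' "$as_string" "$as_number"
```

The first filter lets both rows through, while the second keeps only the row whose third field is numerically above 10.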
If you don't care much for archaic gawk -P/-c/-t modes then it's even simpler:
mawk '10<+$3' RS='\r?\n'
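Let ORS take care of the \r (CR) on your behalf. By placing the ? in the RS regex, you can skip all the steps about using iconv or dos2unix or changing locale settings: the RS --> ORS round trip handles it seamlessly, since each record is read with its optional \r stripped and written back out with the default ORS, a plain \n.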
This way the original input file remains intact, in case you need those CRs later for some reason.
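A minimal sketch of the RS approach on CRLF input, assuming an awk that accepts a regular expression as RS (gawk and mawk do; strictly POSIX or older awks may only honour the first character). The "12.5" row is again hypothetical, added so that one record passes the filter.

```shell
# Feed two CRLF-terminated records through the filter. RS='\r?\n' consumes the
# optional CR together with the LF, so $3 never carries a trailing "\r".
out=$(printf '9_6_F-repl 24834 9.\r\n9_6_F 56523 12.5\r\n' |
      awk '10<+$3' RS='\r?\n')

# The surviving record is written with the default ORS ("\n"), without any CR.
printf '%s\n' "$out"
```

Only the data flowing through the pipe is changed; the source it came from still holds its original CRLF line endings.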