0

How do I replace single zeros (0) with string NA in a tab-separated-values file?

Suppose I have the table:

0\t0.15\t0\t8.05\t0\t0\t0.15\t7.0306\n
5\t0.18\t0\t8.05\t0\t0\t0.5\t50\n
1\t15\t0205\t0\t0.16\t200\t40.90\n

I would like to get:

NA\t0.15\tNA\t8.05\tNA\tNA\t0.15\t7.0306\n
5\t0.18\tNA\t8.05\tNA\tNA\t0.5\t50\n
1\t15\t0205\tNA\t0.16\t200\t40.90\n

That is, I would like to replace only the fields that are exactly 0 (the null measurements in the data frame) with NA.

mklement0
fred

3 Answers

4

awk enables a robust, portable solution:

awk 'BEGIN {FS=OFS="\t"} {for (i=1; i<=NF; ++i) { if ($i=="0") {$i="NA"} }; print}' file
  • BEGIN {FS=OFS="\t"} tells awk - before input processing begins (BEGIN) - to split input lines into fields by tab characters (FS="\t") and to also separate them by tab characters on output (OFS="\t").

    • Reserved variable FS is the [input] field separator; OFS is the output field separator.
  • for (i=1; i<=NF; ++i) loops over all input fields (NF is the count of input fields), resulting from splitting each input line by tabs.

    • if ($i=="0") {$i="NA"} tests each field for being identical to string 0 and, if so, replaces that field ($i) with string NA.

    • On assigning to a field, the input line at hand is implicitly rebuilt from the (modified) field values, using the value of OFS as the separator.

  • print simply prints the (potentially modified) input line at hand.
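As a quick check of the explanation above, here is a minimal sketch that pipes two made-up sample rows (not the data from the question) through the same awk logic. The variable names `result` and the sample values are assumptions chosen for illustration only.

```shell
# Two hypothetical tab-separated rows; only fields that are exactly "0" should change.
result=$(printf '0\t0.15\t0\t8.05\n5\t0\t0\t0\n' |
  awk 'BEGIN {FS=OFS="\t"} {for (i=1; i<=NF; ++i) if ($i=="0") $i="NA"; print}')

# Show the transformed rows; 0.15 and 8.05 are left alone because
# the comparison is against the whole field, not a substring.
printf '%s\n' "$result"
```

Because the test is applied per field rather than per character, adjacent 0 fields and a 0 in the last column are handled without any special casing.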

mklement0
0

With GNU sed:

sed -E ':a;s/(\t)*\b0\b(\t)/\1NA\2/g;ta;' file

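Using backreferences, this replaces each 0 that is followed by a tab (and optionally preceded by one) with NA, re-inserting the captured tabs. The :a ... ta loop repeats the substitution until no match remains.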
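For comparison, here is a sketch of a variation on the same loop idea, not the command above: it anchors each 0 to a field boundary (start of line or tab on the left, tab or end of line on the right). The sample line and the variable names `tab`, `line` and `looped` are assumptions for illustration; a literal tab is passed in through a shell variable so the expression does not depend on sed understanding `\t`.

```shell
tab=$(printf '\t')                       # literal tab character
line=$(printf '0\t0\t0\t7.0\t0')         # hypothetical row: adjacent zeros, 7.0, trailing 0

# Replace one boundary-anchored 0 per iteration; "ta" jumps back to label "a"
# for as long as the previous substitution succeeded.
looped=$(printf '%s\n' "$line" |
  sed -E -e ':a' -e "s/(^|$tab)0($tab|\$)/\1NA\2/" -e 'ta')

printf '%s\n' "$looped"
```

With this anchoring, the 0 inside 7.0 is left alone and the 0 in the last field is replaced, at the cost of one loop iteration per replaced field.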

Graham
SLePort
0

With GNU or OSX sed (-E enables EREs):

$ sed -E 's/(^|\t)0(\t|$)/\1NA\2/g; s/(^|\t)0(\t|$)/\1NA\2/g' file
NA      0.15    NA      8.05    NA      NA      0.15    7.0306
5       0.18    NA      8.05    NA      NA      0.5     50
1       15      NA      205     NA      0.16    200     40.90

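It takes 2 passes because adjacent 0 fields share a tab: the tab consumed as the trailing delimiter of one match cannot also serve as the leading delimiter of the next match within the same pass. See https://stackoverflow.com/a/44908420/1745001 for details.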
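The effect of the second pass can be seen with a small sketch on a made-up line of three adjacent zeros. The names `tab`, `line`, `one_pass` and `two_pass` are illustrative assumptions, and a literal tab is substituted via a shell variable instead of relying on `\t` inside the sed expression.

```shell
tab=$(printf '\t')                 # literal tab character
line=$(printf '0\t0\t0\t0.5')      # hypothetical row with three adjacent 0 fields

# One pass: the tab after the first 0 is consumed by the first match,
# so the second 0 has no leading delimiter left to match against.
one_pass=$(printf '%s\n' "$line" |
  sed -E "s/(^|$tab)0($tab|\$)/\1NA\2/g")

# Two passes: the second substitution picks up the fields skipped by the first.
two_pass=$(printf '%s\n' "$line" |
  sed -E "s/(^|$tab)0($tab|\$)/\1NA\2/g; s/(^|$tab)0($tab|\$)/\1NA\2/g")

printf 'one pass:  %s\n' "$one_pass"
printf 'two passes: %s\n' "$two_pass"
```

After a single pass the middle 0 is still present; after the second pass every field that is exactly 0 has become NA, while 0.5 is untouched.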

Ed Morton