I have a file like this:

head CHR.17.dat
PANEL   FILE    ID  CHR P0  P1  HSQ BEST.GWAS.ID    BEST.GWAS.Z EQTL.ID EQTL.R2 EQTL.Z  EQTL.GWAS.Z NSNP    NWGT    MODEL   MODELCV.R2  MODELCV.PV  TWAS.Z  TWAS.P
NA  ./weight//retina/retina.ENSG00000002834.wgt.RDat    LASP1   17  38869859    38921770    0.082133    NA     NA   NA         NA       NA        NA      0   0 enet     4.53e-02   9.26e-06          NA         NA
NA  ./weight//retina/retina.ENSG00000002919.wgt.RDat    SNX11   17  48103357    48123074    0.014947    NA     NA   NA         NA       NA        NA      0   0 lasso   -1.91e-03   6.32e-01          NA         NA
NA  ./weight//retina/retina.ENSG00000004139.wgt.RDat    SARM1   17  28364356    28404049    0.095283    rs8076604    2.52   rs1128162   -1.96e-0 -3.31   0.67843     94  94 blup     2.19e-02   1.66e-03     0.64027    0.52200

...

Can you please explain what this command is supposed to do:

cat CHR.17.dat | awk 'NR == 1 || $NF < 0.05/461'
anamaria
  • Does this answer your question? [bash which OR operator to use - pipe v double pipe](https://stackoverflow.com/questions/41625521/bash-which-or-operator-to-use-pipe-v-double-pipe) – Wiktor Stribiżew May 06 '20 at 21:22
  • @WiktorStribiżew that's a very different context and semantics for `||` - in the context of that question it's not an `OR` in the same way as it is in the current context, in that context it means `if the last command executed failed then do the following...`. To read it as an `OR` in shell you need to think of shell executing each command with an assumption that it'll succeed and then `cmd1 || cmd2` means something like `cmd1 succeeded so do nothing OR execute cmd2` but IMHO it's a stretch and it's more like an `else` but it's not that either. – Ed Morton May 06 '20 at 21:53
  • I say "not that either" because if it was an `else` then `cmd1 && cmd2 || cmd3` would mean `execute cmd1, if it succeeds call cmd2 else call cmd3` but it doesn't mean that, it means `execute cmd1; if it succeeds call cmd2; if cmd1 or cmd2 failed then call cmd3` because it applies to the exit status of **the last command executed** and if `cmd1` failed then `cmd2` would not be executed and so the `||` would be applied to `cmd1`s exit status but if `cmd1` succeeded then `cmd2` would be executed so then it's `cmd2`s exit status the `||` would be applied to. Anyway - nothing like an `||` in awk! – Ed Morton May 06 '20 at 22:02

1 Answer

As in many languages, || in awk means logical OR. That command prints the current input line if it is the first one (NR == 1) or (||) if the value of its last field ($NF) is less than the given value ($NF < 0.05/461, i.e. about 0.000108). Because the condition has no action block attached, awk applies its default action, which is to print the line.

So it's printing the header line and any other lines for which the 2nd condition is true.
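To see the semantics on something smaller, here is a minimal sketch using the simpler condition `$NF < 10`. The three-line input (header `X Y VAL` plus two data lines) is made up for illustration and is not from the original file:

```shell
# Hypothetical sample input: one header line and two data lines.
# NR == 1 keeps the header; $NF < 10 keeps lines whose last field is below 10.
result=$(printf 'X Y VAL\na b 5\nc d 15\n' | awk 'NR == 1 || $NF < 10')
printf '%s\n' "$result"
```

This should print the header and the `a b 5` line only, since 15 is not less than 10.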

This involves a UUOC (useless use of cat) though:

cat CHR.17.dat | awk 'NR == 1 || $NF < 0.05/461'

and should instead be written:

awk 'NR == 1 || $NF < 0.05/461' CHR.17.dat
Ed Morton
  • Thank you so much, so basically the line will be printed if TWAS.P< 0.05/461? – anamaria May 06 '20 at 21:22
  • No. The line that ends in TWAS.P will be printed because it's the 1st line and so the condition `NR==1` is true. Every OTHER line will be printed when its last field is less than `0.05/461`. Why not try it with a simpler comparison like `$NF < 10` and simpler input with lines like `a b 5` and `c d 15`? – Ed Morton May 06 '20 at 21:25
  • Thank you so much, I know it's a bit outside of the scope of this question but how to print lines where TWAS.P< 0.05/461? I do have NAs there as well – anamaria May 06 '20 at 21:28
  • `TWAS.P` in a numeric comparison context like you show is the number `0` so I don't know what you mean by `print lines where TWAS.P< 0.05/461` since that condition is always true. Sorry, you should post a new question. – Ed Morton May 06 '20 at 21:31
  • Hang on - when you say `TWAS.P` do you actually mean "the value of the field in the column that has `TWAS.P` as its line 1 header string"? If so then [my first response](https://stackoverflow.com/questions/61645552/how-to-explain-awk-command-with-double-pipe/61645574#comment109043062_61645574) would be "yes" instead of "no" as the column with `TWAS.P` at the top of it is the last column and so `$NF` represents the value in that column for every input line. – Ed Morton May 06 '20 at 21:34
  • Yes, the value in the field of TWAS.P. So I would like to print all lines where the value in the field of TWAS.P is < 0.05/461. Can you please tell me how to do that? – anamaria May 06 '20 at 21:43
  • That's exactly what your script already does. Again, why not try it on some simpler data and with a simpler condition so you can understand the semantics better. – Ed Morton May 06 '20 at 21:43
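
On the NA values raised in the comments: NA is not a number, so awk compares that field as a string and not numerically, which makes the outcome less obvious to a reader. As a sketch only, and assuming that NA rows should always be excluded (an assumption, not something stated in the thread), the filter can say so explicitly. The four-line input below is made up for illustration:

```shell
# Hypothetical sample input: header, one NA row, one small p-value, one large p-value.
# Skip rows whose last field is the literal string NA, then force a numeric
# comparison by adding 0 to the last field.
filtered=$(printf 'ID TWAS.P\ng1 NA\ng2 0.00001\ng3 0.52200\n' |
  awk 'NR == 1 || ($NF != "NA" && $NF + 0 < 0.05/461)')
printf '%s\n' "$filtered"
```

With this input, only the header and the `g2` row should be printed.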