
I have an awk command which works perfectly:

awk '{a[$1]++}END{for(i in a){printf i"\t"a[i]"\n"}}' infile

It counts the number of repeats in $1. The output looks like this:

MTRFHLILLPLLFSWFSYCFG_1    1
MLAELSVAFTLAAFALA_rc_1     3

I would like to make the output red, using the escape sequence \033[01;31m.

Usually, when I want to colour the output in awk, I do it like this:

RED='\033[01;31m'
NONE='\033[0m'

awk -v r=$RED -v n=$NONE '{printf r$1n"\n"}' infile 

I tried the same approach with the counting command above, but the first column is missing from the output and nothing is coloured. I think this is because awk does not recognise r and i as separate variables; in bash, for example, I would write $r$i. Is that the case?

Here is the command I have tried:

awk -v r=$RED -v n=$NONE '{a[$1]++}END{for(i in a){printf ri"\t"a[i]"\n"n}}' infile

The output looks like this:

1 #See how the first half of the output (i) is missed and is not coloured. 
3

Can anybody explain why this is not working and help me fix it?

Thank you

Jpike
    `it doesn't work` is the worst possible problem statement, as it tells us nothing useful for debugging the problem. Also, the script that statement refers to, `awk -v r=$RED -v n=$NONE '{printf r$1n"\n"}' infile`, actually DOES work, which is misleading; it's only the script later in your question that doesn't work. – Ed Morton Mar 10 '21 at 14:08

1 Answer


Since the question says only that the code "doesn't work", without saying in what way, here is a list of things in the code that might be causing it to "not work":

  1. Always quote your shell variables (r="$RED", not r=$RED), see https://mywiki.wooledge.org/Quotes. Quotes are something you must use by default and remove when you need to, not something you add when you need to.
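  2. To concatenate variables you need to leave some separator between them. Given variables named r and i you can concatenate them with r i or (r)(i), but if you write ri that's just another variable named ri. That variable is never assigned, so it evaluates to an empty string, which is why the first column and the colour are missing from your output.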
  3. Always do printf "%s", foo, not printf foo, for any foo that contains input data as the latter will fail whenever foo contains print formatting characters such as %s.
  4. Don't use all-upper-case for non-exported shell variables (see Correct Bash and shell script variable capitalization).
red='\033[01;31m'
none='\033[0m'

seq 3 | awk -v r="$red" -v n="$none" '{printf "%s%s%s\n", r, $1, n}'
1
2
3
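
As a quick check of point 3 in the list, here is a small sketch of what happens when input data ends up in the format string. The sample value 50%% is made up for illustration; any data containing % would show a similar difference.

```shell
# Hypothetical input line "50%%" (the shell's printf turns %%%% into %%).
# Used as the format string, awk treats each %% as an escaped single %.
as_format=$(printf '50%%%%\n' | awk '{ printf $1 "\n" }')

# Passed as an argument to a fixed "%s" format, the data is printed untouched.
as_data=$(printf '50%%%%\n' | awk '{ printf "%s\n", $1 }')

echo "data as format string: $as_format"
echo "data as argument:      $as_data"
```

The first form silently alters the data. With input such as %s or %d it would ask for arguments that don't exist, and how awk reacts to that depends on the implementation.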

And for your other script:

seq 3 | awk -v r="$red" -v n="$none" '{a[$1]++}END{for(i in a){printf "%s%s\t%s%s\n", r, i, a[i], n}}'
1       1
2       1
3       1
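
To see in isolation why the original END block lost its first column, here is a minimal sketch. It uses a visible "<" in place of the colour code so the result can be read in plain text; that placeholder and the value "key" are assumptions for illustration only.

```shell
# r and i are both assigned; "r i" concatenates their values.
joined=$(awk 'BEGIN { r = "<"; i = "key"; printf "[%s]", r i }')

# "ri" is parsed as one identifier that was never assigned,
# so it evaluates to the empty string.
fused=$(awk 'BEGIN { r = "<"; i = "key"; printf "[%s]", ri }')

echo "r i -> $joined"
echo "ri  -> $fused"
```

With the space the brackets contain "<key"; without it they are empty, which matches the missing, uncoloured first column in the question.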

It's not obvious why you're defining your colors as shell variables and passing those values to awk instead of just defining them in awk though:

seq 3 | awk '
    BEGIN {
        red  = "\033[01;31m"
        none = "\033[0m"
    }
    { printf "%s%s%s\n", red, $1, none }
'
1
2
3

(All output above is colored red, honest!).
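
Putting the pieces together, the counting script could be written with the colours defined inside awk, roughly as follows. This is only a sketch: the function name count_colored, the array name count and the three-line sample input are hypothetical, and the order of the output lines is not guaranteed because for (key in count) visits keys in an unspecified order.

```shell
# Count how often each value of the first field occurs and print
# "value<TAB>count" wrapped in bold red, then reset the terminal colour.
count_colored() {
    awk '
        BEGIN {
            red  = "\033[01;31m"   # bold red
            none = "\033[0m"       # reset attributes
        }
        { count[$1]++ }
        END {
            for (key in count)
                printf "%s%s\t%s%s\n", red, key, count[key], none
        }
    ' "$@"
}

# Hypothetical sample input: "a" appears twice, "b" once.
printf 'a\nb\na\n' | count_colored
```

Because the function passes "$@" through to awk, it can be given a file name (count_colored infile) or read from a pipe as above.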

Ed Morton
    Thank you for your help - much appreciated. Apologies, I should have made clearer in what way the code does not work. As you have amended your answer to indicate that the way in which I wrote the question was incorrect, I'll leave the question as it is so that others may learn from my mistake. Solution 2 fixes the issue, so I assume that the problem was the concatenation of variables. But, having read through the links etc., it seems that the approaches I have been using, although they work, are incorrect/poor practice. I will try to do things correctly from now on. Thanks :) – Jpike Mar 10 '21 at 14:45