7

I would like to be able to pass an array variable to awk. I don't mean a shell array but a native awk one. I know I can pass scalar variables like this:

awk -vfoo="1" 'NR==foo' file

Can I use the same mechanism to define an awk array? Something like:

$  awk -v"foo[0]=1" 'NR==foo' file
awk: fatal: `foo[0]' is not a legal variable name

I've tried a few variations of the above, but none of them work on GNU awk 4.1.1 on my Debian. So, is there any version of awk (gawk, mawk or anything else) that can accept an array from the -v switch?

I know I can work around this and can easily think of ways to do so, I am just wondering if any awk implementation supports this kind of functionality natively.

terdon
  • I don't know the answer for sure but I suspect not, given that to the best of my knowledge, awk doesn't even support array initialisation like `a = [1, 2, 3]`... – Tom Fenech Oct 13 '15 at 14:57
  • I don't think this is possible. Note @TomFenech that this initialisation is not possible because in `awk` they are only associative. – fedorqui Oct 13 '15 at 14:59
  • you can't define it outside of awk but you can transform a bash array to awk array by serialization, see http://stackoverflow.com/a/32887826/1435869 – karakfa Oct 13 '15 at 15:16
  • @karakfa thanks, but I am aware of that. That's why I specifically pointed out that "I don't mean a shell array but a native `awk` one". – terdon Oct 13 '15 at 15:23

5 Answers

7

You can use the split() function inside mawk or gawk to split the input of the "-v" value (here is the gawk man page):

split(s, a [, r [, seps] ])

Split the string s into the array a and the separators array seps on the regular expression r, and return the number of fields.

Here is an example in which I pass "ARRAYVAR", a comma-separated list of values that stands in for my array, to the awk program with "-v". The program then splits it into the internal array "arrayval" using the split() function and prints the third element of that array:

echo 0 | gawk -v ARRAYVAR="a,b,c,d,e,f" '{ split(ARRAYVAR,arrayval,","); print(arrayval[3]) }'
c

Seems to work :)
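
As a minimal sketch of the same idea without the `echo` pipe, the split can be done in a `BEGIN` rule, so awk needs no input at all. The names `list` and `arr` are only illustrative, and the array is still built inside the script rather than passed in natively:

```shell
# Pass a comma-separated string with -v, then build the array in BEGIN.
# 'list' and 'arr' are hypothetical names chosen for this sketch.
result=$(awk -v list="a,b,c,d,e,f" 'BEGIN {
  n = split(list, arr, ",")   # n is the number of elements (6 here)
  print arr[3]                # split() numbers elements from 1
}')
echo "$result"
```

Because split() returns the element count, `n` can also drive an ordered loop over `arr[1]` to `arr[n]`.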

fedorqui
rmicmir
  • Yes, thank you, I know. That's why I specifically said that "I don't mean a shell array but a native awk one." The question is about passing an `awk` array using the `-v` switch, not about generating the array inside the script. – terdon Oct 13 '15 at 21:51
  • Note you can test your `awk` by saying `awk 'BEGIN {actions here}'`. No need to `echo ... | awk`. – fedorqui Oct 14 '15 at 13:58
1

It looks like it is impossible by definition.

From man awk we have that:

-v var=val

--assign var=val

Assign the value val to the variable var, before execution of the program begins. Such variable values are available to the BEGIN rule of an AWK program.

Then we read in Using Variables in a Program that:

The name of a variable must be a sequence of letters, digits, or underscores, and it may not begin with a digit.

Variables in awk can be assigned either numeric or string values.

So the way -v is defined makes it impossible to provide an array: the text before the = must be a plain variable name, which cannot contain the characters [ or ], and the value assigned is always a single number or string. Since arrays in awk are only associative, every element has to be addressed with a subscript, which is exactly what -v does not allow.
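
A quick way to check this against whatever awk is installed is to look at the exit status of such an invocation. This is only a sketch: it assumes the `awk` on your PATH behaves like gawk or mawk and exits with a non-zero status when a `-v` assignment is invalid. The variable `status` is just an illustrative name:

```shell
# Try to assign an array element with -v and record whether awk accepts it.
# Assumption: an invalid -v assignment makes awk exit with a non-zero status.
if awk -v 'foo[0]=1' 'BEGIN { exit 0 }' 2>/dev/null; then
  status="accepted"
else
  status="rejected"
fi
echo "$status"
```

With the GNU awk from the question, the fatal "not a legal variable name" error should put this in the "rejected" branch.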

fedorqui
1

If you don't insist on using -v you could use -i (include) instead to read an awk file that contains the variable settings. Like this:

if F=$(mktemp inputXXXXXX); then
    cat >"$F" << 'END'
BEGIN {
    foo[0]=1
}
END
    cat "$F"
    awk -i "$F" 'BEGIN { print foo[0] }' </dev/null
    rm "$F"
fi

Sample trace (using gawk-4.2.1):

bash -x /tmp/test.sh 
++ mktemp inputXXXXXX
+ F=inputrpMsan
+ cat
+ cat inputrpMsan
BEGIN {
    foo[0]=1
}
+ awk -i inputrpMsan 'BEGIN { print foo[0] }'
1
+ rm inputrpMsan
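
Note that -i (--include) is a gawk extension. As a hedged sketch of a more portable variant, the same effect can be had with several -f options, which POSIX awk accepts. The catch is that the main program then also has to live in a file, because -f cannot be mixed with program text on the command line. The names `vars`, `prog` and `result` are only illustrative:

```shell
# Portable sketch: put the array definition and the main program in two
# files and load both with -f. BEGIN rules run in the order they are read.
vars=$(mktemp) || exit 1
prog=$(mktemp) || exit 1

cat >"$vars" <<'END'
BEGIN { foo[0] = 1 }
END

cat >"$prog" <<'END'
BEGIN { print foo[0] }
END

result=$(awk -f "$vars" -f "$prog" </dev/null)
echo "$result"
rm -f "$vars" "$prog"
```

The file holding the array can then be reused with any number of other program files.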
U. Windl
  • Thanks, but as explained in the question, "I don't mean a shell array but a native awk one." The question is about passing an awk array using the -v switch, not about generating the array inside the script or other workarounds. In any case, if I were to have the array in a file, I would just write the entire awk script in the file. – terdon Mar 18 '22 at 09:04
  • Where do you see a shell array in the example? In your question you specified the value on the awk command line; for this example you'd have to create a file with the values (from awk) before calling the awk that uses those values. With the code in the original question it's hard to have better code in the answer. – U. Windl Mar 18 '22 at 10:25
  • Fair point, I expressed myself badly in my previous comment, sorry. I meant that the question is specifically about passing a native awk array on launch using the -v switch and I was not looking for workarounds, I only wanted to know if it is possible with the -v switch. That said, I have now had my second coffee, so I can now see the benefits of your approach. No, it isn't what I asked for, but it does at least give me a way of defining an array and then re-using it in as many awk one liners as I want. Thanks! I am guessing this is a GNU awk feature, right? – terdon Mar 18 '22 at 11:58
0

Unfortunately, this is not possible. However, there are ways to convert a bash array to an awk array, for example by passing its elements as command-line arguments.

I recently needed to pass a bash array to awk to use as a filter, so here is what I did:

$ arr=( hello world this is bash array )
$ echo -e 'this\nmight\nnot\nshow\nup' | awk 'BEGIN {
  for (i = 1; i < ARGC; i++) {
      my_filter[ARGV[i]]=1
      ARGV[i]="" # blank out ARGV[i], otherwise awk will try to read it as a file
  }
} !my_filter[$0]' "${arr[@]}"

Output:

might
not
show
up
smac89
  • Yes, thank you, I know. That's why I specifically said that "I don't mean a shell array but a native awk one." The question is about passing an awk array using the -v switch, not about generating the array inside the script – terdon Apr 11 '21 at 11:58
-1

For associative arrays, you could pass it as a string of key-value pairs, and then reformat it in the BEGIN section.

$ echo | awk -v m="a,b;c,d" '
BEGIN {
  split(m,M,";")
  for (i in M) {
    split(M[i],MM,",")
    MA[MM[1]]=MM[2]
  }
}
{
  for (a in MA) {
    printf("MA[%s]=%s\n",a, MA[a])
  }
}'

Output:

MA[a]=b
MA[c]=d
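
One caveat: the order in which `for (i in M)` visits elements is unspecified in awk, so the lines above may come out in a different order on another implementation. A minimal sketch that keeps the original order uses the count returned by split(). The names `pairs`, `kv` and `map` are only illustrative:

```shell
# Walk the pairs by index so the output order matches the input string.
result=$(awk -v m="a,b;c,d" 'BEGIN {
  n = split(m, pairs, ";")        # pairs[1]="a,b", pairs[2]="c,d"
  for (i = 1; i <= n; i++) {
    split(pairs[i], kv, ",")      # kv[1] is the key, kv[2] the value
    map[kv[1]] = kv[2]
    printf "map[%s]=%s\n", kv[1], map[kv[1]]
  }
}')
echo "$result"
```

This still builds the array inside the script; it only makes the traversal order predictable.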
remgeek
  • Thanks, but as I said in the question, I know how to do workarounds for this. The question was asking specifically if an array can be passed as a variable. Awk is a fully fledged programming language, so of course you can do things like you describe here. – terdon Jan 05 '22 at 09:57
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jan 05 '22 at 14:29