
I found some ways to pass external shell variables to an awk script, but I'm confused about ' and ".

First, I tried with a shell script:

$ v=123test
$ echo $v
123test
$ echo "$v"
123test

Then tried awk:

$ awk 'BEGIN{print "'$v'"}'
123test
$ awk 'BEGIN{print '"$v"'}'
123

Why the difference?

Lastly I tried this:

$ awk 'BEGIN{print " '$v' "}'
 123test
$ awk 'BEGIN{print ' "$v" '}'
awk: cmd. line:1: BEGIN{print
awk: cmd. line:1:             ^ unexpected newline or end of string 

I'm confused about this.

codeforester
hqjma
  • I like the -v as shown below, but this is really a great exercise in thinking about how to protect things from the shell. Working through this, my first cut used backslashes on spaces and dollar signs. Needless to say the examples here were well worth my time. – Chris Dec 20 '16 at 21:00
  • Related: [Difference between single and double quotes in awk](https://stackoverflow.com/q/44445852/6862601). – codeforester May 11 '18 at 18:45
  • If your awk search needs to use the variable as a **regular expression**, you can't write `/var/`. Instead, use a tilde: `awk -v var="$var" '$0 ~ var'` – Noam Manos May 07 '20 at 09:29
  • @NoamManos, why is it not possible to use a variable inside a reg expression delimited by "//" ? I've been reading a lot of info (and superb awk manual, BTW) for a few hours and I am already a bit overwhelmed, so apologies if this is easy to find out – Kiteloopdesign Aug 10 '22 at 09:42
  • @Kiteloopdesign because the `/.../` delimiters mean a **literal** regexp so nothing expands inside them. If you don't want a literal regexp then don't use `/.../` delimiters, use `"..."` and/or variables for a **dynamic** regexp. – Ed Morton Jul 12 '23 at 11:23

7 Answers


Getting shell variables into awk may be done in several ways. Some are better than others. This should cover most of them. If you have a comment, please leave it below.


Using -v (The best way, most portable)

Use the -v option. (P.S. put a space after -v, or it will be less portable: write awk -v var=, not awk -vvar=.)

variable="line one\nline two"
awk -v var="$variable" 'BEGIN {print var}'
line one
line two

This should be compatible with most awks, and the variable is available in the BEGIN block as well.

If you have multiple variables:

awk -v a="$var1" -v b="$var2" 'BEGIN {print a,b}'

Warning: as Ed Morton writes, -v interprets escape sequences in the value, so \t becomes a real tab rather than the two characters \ and t, which matters if the literal text is what you are searching for. To pass the value verbatim, use ENVIRON[] or ARGV[] instead.
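To make the warning concrete, here is a small sketch comparing the two routes. The variable names (value, with_v, with_env) are made up for the illustration:

```shell
# The value is four characters: a, backslash, t, b
value='a\tb'

# -v processes escape sequences, so awk sees a<TAB>b
with_v=$(awk -v var="$value" 'BEGIN { print var }')

# ENVIRON hands the string over verbatim, backslash included
with_env=$(var="$value" awk 'BEGIN { print ENVIRON["var"] }')

printf '%s\n' "$with_v" "$with_env"
```

The first line of output contains a real tab, the second the literal text a\tb.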

P.S. If your field separator contains a vertical bar or other regexp metacharacters (|, ?, ( etc.), they must be double escaped. For example, a separator of three vertical bars ||| becomes -F'\\|\\|\\|'. You can also use a bracket expression: -F"[|][|][|]".
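A quick way to check a separator like this is to feed awk one sample line. This is only a sketch with made-up input; the bracket form is the one I would reach for first, since it does not depend on how many backslashes survive:

```shell
# Bracket-expression form: each | is matched literally
second=$(printf 'a|||b|||c\n' | awk -F'[|][|][|]' '{ print $2 }')
printf '%s\n' "$second"    # b

# Double-escaped form from the text above; it should pick out the same field
printf 'a|||b|||c\n' | awk -F'\\|\\|\\|' '{ print $2 }'
```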

Example of getting data from a program/function into awk (here date is used):

awk -v time="$(date +"%F %H:%M" -d '-1 minute')" 'BEGIN {print time}'

Example of testing the contents of a shell variable as a regexp:

awk -v var="$variable" '$0 ~ var{print "found it"}'
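As a runnable sketch of the same idea (the pattern and the input lines are invented for the example):

```shell
# A dynamic regexp held in a shell variable
pattern='^err(or)?:'

matches=$(printf '%s\n' 'error: disk full' 'ok: all good' 'err: timeout' |
    awk -v re="$pattern" '$0 ~ re { print "found it:", $0 }')

printf '%s\n' "$matches"
```

Only the first and third input lines match, so two "found it:" lines are printed.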

Variable after code block

Here we get the variable after the awk code. This will work fine as long as you do not need the variable in the BEGIN block:

variable="line one\nline two"
echo "input data" | awk '{print var}' var="${variable}"
or
awk '{print var}' var="${variable}" file
  • Adding multiple variables:

awk '{print a,b,$0}' a="$var1" b="$var2" file

  • In this way we can also set different Field Separator FS for each file.

awk 'some code' FS=',' file1.txt FS=';' file2.ext

  • Variable after the code block will not work for the BEGIN block:

echo "input data" | awk 'BEGIN {print var}' var="${variable}"
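A minimal sketch that shows the difference side by side. The brackets are just there to make an empty value visible:

```shell
variable="hello"

# The assignment operand is processed only when awk reaches it in the
# argument list, which is after BEGIN has already run.
result=$(echo "input data" |
    awk 'BEGIN { print "BEGIN: [" var "]" } { print "main: [" var "]" }' var="$variable")

printf '%s\n' "$result"
# BEGIN: []
# main: [hello]
```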


Here-string

A variable can also be passed to awk using a here-string in shells that support them (including Bash):

awk '{print $0}' <<< "$variable"
test

This is the same as (a here-string appends a trailing newline, hence the \n):

printf '%s\n' "$variable" | awk '{print $0}'

P.S. this treats the variable as a file input.


ENVIRON input

As TrueY writes, you can use the ENVIRON array to read environment variables. If you set a variable before running awk, you can print it like this:

export X=MyVar
awk 'BEGIN{print ENVIRON["X"],ENVIRON["SHELL"]}'
MyVar /bin/bash

or for a non-exported variable:

x=MyVar
x="$x" awk 'BEGIN{print ENVIRON["x"],ENVIRON["SHELL"]}'
MyVar /bin/bash

ARGV input

As Steven Penny writes, you can use ARGV to get the data into awk:

v="my data"
awk 'BEGIN {print ARGV[1]}' "$v"
my data

To get the data into the code itself, not just the BEGIN:

v="my data"
echo "test" | awk 'BEGIN{var=ARGV[1];ARGV[1]=""} {print var, $0}' "$v"
my data test
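Like ENVIRON, ARGV hands the string over without escape processing, which is easy to check with a value full of backslashes. A sketch with a made-up value:

```shell
v='C:\temp\new'

# Save ARGV[1], then blank it so awk does not try to open it as a file
result=$(echo "test" |
    awk 'BEGIN { var = ARGV[1]; ARGV[1] = "" } { print var, $0 }' "$v")

printf '%s\n' "$result"    # C:\temp\new test
```

Note this only holds for an argument that does not look like name=value; such an argument would be treated as an assignment instead.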

Variable within the code: USE WITH CAUTION

You can expand a shell variable directly inside the awk code, but it's messy and hard to read, and as Charles Duffy points out, it is also open to code injection: if someone puts bad stuff in the variable, it will be executed as part of the awk code.

This works by having the shell expand the variable inside the script text, so its value becomes part of the awk program.

If you want to build an awk program that changes dynamically based on variables, you can do it this way, but DO NOT use it for ordinary data.

variable="line one\nline two"
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
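If the quoting is hard to follow, it may help to look at the program text the shell actually hands to awk. The sketch below does that for the two commands from the question (prog1 and prog2 are just names picked for the illustration):

```shell
v=123test

# The shell glues the quoted pieces and the expansion into one word
prog1='BEGIN{print "'$v'"}'
prog2='BEGIN{print '"$v"'}'

printf '%s\n' "$prog1"    # BEGIN{print "123test"}
printf '%s\n' "$prog2"    # BEGIN{print 123test}

awk "$prog1"              # 123test (a string literal)
awk "$prog2"              # 123 (the number 123 followed by the empty awk variable test)
```

In the second form awk never sees any quotes, so 123test is parsed as awk code rather than as a string.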

Here is an example of code injection:

variable='line one\nline two" ; for (i=1;i<=1000;++i) print i"'
awk 'BEGIN {print "'"$variable"'"}'
line one
line two
1
2
3
.
.
1000

You can add lots of commands to awk this way, or even make it crash with invalid commands.

One valid use of this approach, though, is when you want to pass a symbol to awk to be applied to some input, e.g. a simple calculator:

$ calc() { awk -v x="$1" -v z="$3" 'BEGIN{ print x '"$2"' z }'; }

$ calc 2.7 '+' 3.4
6.1

$ calc 2.7 '*' 3.4
9.18

There is no way to do that using an awk variable populated with the value of a shell variable, you NEED the shell variable to expand to become part of the text of the awk script before awk interprets it. (see comment below by Ed M.)
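Since the operator becomes awk code, one way to keep the injection risk down is to check it against a fixed list before splicing it in. This is only a sketch; the whitelist of + - * / is my assumption, not part of the original calc:

```shell
calc() {
    # Accept only known arithmetic operators; anything else is rejected
    case "$2" in
        '+'|'-'|'*'|'/') ;;
        *) echo "calc: unsupported operator: $2" >&2; return 1 ;;
    esac
    # The operands still travel as data via -v; only the vetted operator is code
    awk -v x="$1" -v z="$3" 'BEGIN{ print x '"$2"' z }'
}

calc 2.7 '+' 3.4    # 6.1
calc 2.7 '*' 3.4    # 9.18
```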


Extra info:

Use of double quote

It's always good to double quote the variable: "$variable"
If you don't, the shell splits the value on whitespace, so multiple lines are joined into one long line.

Example:

var="Line one
This is line two"

echo $var
Line one This is line two

echo "$var"
Line one
This is line two

Other errors you can get without double quote:

variable="line one\nline two"
awk -v var=$variable 'BEGIN {print var}'
awk: cmd. line:1: one\nline
awk: cmd. line:1:    ^ backslash not last character on line
awk: cmd. line:1: one\nline
awk: cmd. line:1:    ^ syntax error

And with single quote, it does not expand the value of the variable:

awk -v var='$variable' 'BEGIN {print var}'
$variable

More info about AWK and variables

Read this faq.

Ed Morton
Jotne
  • I strongly disagree that `-v` is the "best, most portable way". `awk -v a=b cmds path1 path2` is (almost) equivalent to `awk cmds a=b path1 path2`, but there is no good way to use `-v` to emulate `awk cmds path1 a=b path2`. Defining variables in the arguments is an extremely useful technique which is equally portable and which I will argue is "better". – William Pursell Feb 08 '21 at 21:48
  • @WilliamPursell when you define your variables among the file names in the args list a) they aren't set in the `BEGIN` section and b) they are interleaved with the file names in `ARGV[]` and so make it harder to loop on file names, compare the current `FILENAME` to an `ARGV[]` position, e.g. to use `FILENAME==ARGV[1]` instead of `NR==FNR` to avoid empty input file issues in multi-input-file scripts. IMHO the only time to do that is when you need to change the values of variables (e.g. `FS`) between files, otherwise use `-v` or `ENVIRON[]` for the most intuitive use of variables. – Ed Morton Jan 12 '23 at 14:56
  • Regarding `there is no good way to use -v to emulate awk cmds path1 a=b path2` - you could also claim there's no good way to use that approach to emulate `awk -v a=b cmds path1 path2`, as they just have different semantics. IMHO it's easier to emulate `awk cmds path1 a=b path2` with `awk -v a=b cmds path1 path2` than the other way around though as `a` is simply not available in the BEGIN section the first way and it's pretty easy to, in the BEGIN section, save/clear/set it between files the second way. – Ed Morton Jan 12 '23 at 15:13

It seems that the good-old ENVIRON built-in hash is not mentioned at all. An example of its usage:

$ X=Solaris awk 'BEGIN{print ENVIRON["X"], ENVIRON["TERM"]}'
Solaris rxvt
TrueY
  • This is a good suggestion because it passes the data verbatim. `-v` doesn't work when the value contains backslashes. – that other guy Feb 23 '16 at 21:45
  • @thatotherguy I did not know that! I thought that if I use `awk -v x='\c\d' ...` then it would be used properly. But when `x` is printed, awk emits the famous `awk: warning: escape sequence '\c' treated as plain 'c'` message... Thanks! – TrueY Feb 24 '16 at 09:11
  • It does work properly - properly in this context means expand escape sequences because that's how `-v` was designed to work so you can use `\t` in the variable and have it match a literal tab in the data, for example. If that's not the behavior you want then you don't use `-v` you use `ARGV[]` or `ENVIRON[]`. – Ed Morton Jul 07 '19 at 15:02

You could pass the command-line option -v with an awk variable name (v) and, after the =, the value of the shell variable ("${v}"):

% awk -vv="${v}" 'BEGIN { print v }'
123test

Or to make it clearer (with far fewer vs):

% environment_variable=123test
% awk -vawk_variable="${environment_variable}" 'BEGIN { print awk_variable }'
123test
johnsyweb
  • This just reiterates part of the accepted answer but will only work in some awks due to no space between `-v` and `v=`. – Ed Morton Jan 13 '23 at 13:28

You can utilize ARGV:

v=123test
awk 'BEGIN {print ARGV[1]}' "$v"

Note that if you are going to continue into the body, you will need to adjust ARGC:

awk 'BEGIN {ARGC--} {print ARGV[2], $0}' file "$v"
Zombo
  • This just reiterates part of the accepted answer and YMMV with just decrementing ARGC without clearing its slot in ARGV[]. – Ed Morton Jan 13 '23 at 13:31

I just adapted @Jotne's answer for use in a for loop.

for i in `seq 11 20`; do host myserver-$i | awk -v i="$i" '{print "myserver-"i" " $4}'; done
edib
  • This merely seems to be another illustration of how to use Awk's `-v` option which was already mentioned in many of the existing answers. If you want to show how to run Awk in a loop, that's a different question really. – tripleee Jul 07 '19 at 10:27

I had to insert the date at the beginning of each line of a log file, and did it like this:

DATE=$(date +"%Y-%m-%d")
awk '{ print "'"$DATE"'", $0; }' /path_to_log_file/log_file.log

The output can be redirected to another file to save it.

Sina
  • The double quote - single quote - double quote was exactly what I needed to make mine work. – user53029 Jul 21 '16 at 14:24
  • This was already mentioned in the accepted answer as a method you should not use due to code injection vulnerabilities. So the information here is redundant (already described in the accepted answer), and incomplete (does not mention the problems with this method). – Jason S Oct 12 '16 at 05:20

Pro Tip

It can come in handy to create a function that handles this, so you don't have to type everything every time. Using the selected solution we get...

awk_switch_columns() {
     cat < /dev/stdin | awk -v a="$1" -v b="$2" " { t = \$a; \$a = \$b; \$b = t; print; } "
}

And use it as...

echo 'a b c d' | awk_switch_columns 2 4

Output:
a d c b
ibitebyt3s
  • See UUOC in https://porkmail.org/era/unix/award. Also - use single instead of double quotes around your awk script (as you always should by default) and then you won't have to escape the `$`s within it because you won't be inviting the shell to interpret it before awk sees it. It's not obvious why you put big, bold "Pro Tip" at the top of this answer, most of the other answers are better and this doesn't add any value to the accepted answer, it just uses it in one specific context. – Ed Morton Jan 13 '23 at 13:27
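
Following the suggestions in that comment, the function could be sketched like this: no cat, and single quotes around the script so the shell leaves $a and $b for awk. This is my reading of the comment, not code posted by either author:

```shell
awk_switch_columns() {
    # awk reads the function's stdin directly; a and b hold the column numbers
    awk -v a="$1" -v b="$2" '{ t = $a; $a = $b; $b = t; print }'
}

echo 'a b c d' | awk_switch_columns 2 4    # a d c b
```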