59

I am trying to run a shell command from within awk for each line of a file, and the shell command needs one input argument. I tried to use system(), but it didn't recognize the input argument.

Each line of this file is the path of a file, and I want to run a command to process that file. As a simple example, I want to run the 'wc' command for each line and pass $1 to wc.

awk '{system("wc $1")}' myfile
user229044
Vahid Mirjalili
  • 2
    Why do you think awk is the right tool for this job? It seems like `xargs` or a simple shell `while read line` loop would be better and easier. – glenn jackman Dec 18 '13 at 02:58
  • 1
    On the flip side: Why do you think wc is the right tool for this job? It seems like awk builtin variables and functions would be better and easier? – Ed Morton Dec 18 '13 at 13:51

5 Answers

73

You are close. Inside double quotes awk does not expand $1, so you have to concatenate the command string with the awk variable:

awk '{system("wc "$1)}' myfile
Kent
  • Thanks, that works! One more question: can we assign the output to a new variable? – Vahid Mirjalili Dec 17 '13 at 23:37
  • 5
    That's the wrong syntax for this job, it's a wrong application for system(), the print does not do what you think it will do, and no you can't assign the output of a system() call to an awk variable, what you posted in your comment assigns the return code from system() to a variable. Time for some coffee @Kent! – Ed Morton Dec 18 '13 at 13:45
  • 3
    @EdMorton yes! system returns the status code.....I was confused by vimscripts' `system()`...:( . cmd|getline var should read the output...... this answer/comment is not correct for the variable part. – Kent Dec 18 '13 at 14:05
  • 8
    It's also not correct for the system() part - it should be `system("wc \"" $1 "\"")`. – Ed Morton Dec 18 '13 at 14:09
  • How to deal with passing 2 parameters? – uuu777 Jul 03 '19 at 16:28
  • @zzz777 don't get you. perhaps you can ask a question with your example input – Kent Jul 04 '19 at 08:31
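Pulling the comments above together, here is a minimal sketch of a quoted system() call. The `shquote` helper is hypothetical (it is not an awk built-in), and the temporary files are made up for illustration. It uses $0 rather than $1 so that a path containing spaces stays in one piece.

```shell
# Sample data (hypothetical): a list file naming one file per line.
tmp=$(mktemp -d)
printf 'one two\nthree\n' > "$tmp/a b.txt"
printf '%s\n' "$tmp/a b.txt" > "$tmp/list"

result=$(awk '
# shquote() (hypothetical helper) wraps a string in single quotes for
# the shell and escapes any single quotes already inside it.
function shquote(s) {
    gsub(/\047/, "\047\\\047\047", s)
    return "\047" s "\047"
}
# Run wc once per line; its output goes straight to stdout.
{ system("wc -l < " shquote($0)) }
' "$tmp/list")

echo "$result"
rm -rf "$tmp"
```

Two arguments follow the same pattern: quote each field and join them with a space, for example `system("cmd " shquote($1) " " shquote($2))`, where `cmd` is a placeholder for your command.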
52

You cannot capture the output of an awk system() call; you can only get the exit status. To read the output, use the getline/pipe or getline/variable/pipe constructs:

awk '{
    cmd = "your_command " $1
    while (cmd | getline line) {
        do_something_with(line) 
    }
    close(cmd)
}' file
glenn jackman
  • 3
    +1 for the correct way to get the output of the shell command, but in general the syntax to create the variable is `cmd = "your_command \"" $1 ""\"` so the argument is quoted when cmd is executed, and you need to test for the result of getline being greater than zero or you'll get stuck in an infinite loop if it fails. – Ed Morton Dec 18 '13 at 13:47
  • 1
    +1 . OP, if you need store the output in a var, accept this answer too. mine was not correct for var assignment. – Kent Dec 18 '13 at 14:08
  • @EdMorton I think you mean `cmd = "your_command \"" $1 "\""` – jaygooby Dec 15 '21 at 10:32
  • 1
    @jaygooby yes I do, well spotted. – Ed Morton Dec 15 '21 at 13:10
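With both points from these comments applied (quoted argument, getline tested against zero), the loop might look like the sketch below. `cat` stands in for `your_command`, and the sample files are assumptions made so the example runs on its own.

```shell
# Sample data (hypothetical): a list file naming one existing file.
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmp/data.txt"
printf '%s\n' "$tmp/data.txt" > "$tmp/list"

count=$(awk '
{
    # Quote the argument so the shell sees it as a single word.
    cmd = "cat \"" $1 "\""
    n = 0
    # getline returns 1 per line read, 0 at end of output, -1 on error;
    # testing "> 0" ends the loop in both of the last two cases.
    while ((cmd | getline line) > 0) {
        n++
    }
    close(cmd)   # release the pipe so the same command can be run again
    print n
}' "$tmp/list")

echo "$count"
rm -rf "$tmp"
```

Double quotes protect spaces but not `$`, backticks or embedded double quotes in the file name, so treat this as adequate for trusted input only.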
2

FYI here's how to use awk to process files whose names are stored in a file (providing wc-like functionality in this example):

gawk '
NR==FNR { ARGV[ARGC++]=$0; next }
{ nW+=NF; nC+=(length($0) + 1) }
ENDFILE { print FILENAME, FNR, nW, nC; nW=nC=0 }
' file

The above uses GNU awk for ENDFILE. With other awks just store the values in an array and print in a loop in the END section.
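A rough sketch of that array-based variant follows, assuming an awk that picks up ARGV entries added while the first file is being read (the same assumption the gawk script makes). The array names and sample files are made up for illustration. Unlike ENDFILE, this version never sees an empty file, so empty files are missing from the report.

```shell
# Sample data (hypothetical): two small files and a list naming them.
tmp=$(mktemp -d)
printf 'one two\nthree\n' > "$tmp/a.txt"
printf 'four\n' > "$tmp/b.txt"
printf '%s\n' "$tmp/a.txt" "$tmp/b.txt" > "$tmp/list"

report=$(awk '
NR==FNR { ARGV[ARGC++] = $0; next }      # list file: queue each name
FNR==1  { names[++nFiles] = FILENAME }   # remember files in input order
{
    nL[FILENAME]++                       # lines
    nW[FILENAME] += NF                   # words
    nC[FILENAME] += length($0) + 1       # characters, +1 for the newline
}
END {
    for (i = 1; i <= nFiles; i++) {
        f = names[i]
        print f, nL[f], nW[f], nC[f]
    }
}' "$tmp/list")

echo "$report"
rm -rf "$tmp"
```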

Ed Morton
1

I would suggest another solution:

awk '{print $1}' myfile | xargs wc

The difference is that this runs wc once with all the file names as arguments, instead of once per line. That works for any command that accepts multiple arguments (kill, for example).
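If the paths may contain spaces, plain xargs splits them into several arguments. One possible workaround, sketched below, passes whole lines separated by NUL bytes. It assumes an xargs that supports `-0` (GNU, BSD and BusyBox do, but it is not in POSIX), and the sample files are invented for the example.

```shell
# Sample data (hypothetical): one of the paths contains a space.
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/a b.txt"
printf 'three\n' > "$tmp/c.txt"
printf '%s\n' "$tmp/a b.txt" "$tmp/c.txt" > "$tmp/list"

# Turn each newline into a NUL so xargs -0 treats every line as exactly
# one argument, then run wc a single time over all of them.
out=$(tr '\n' '\0' < "$tmp/list" | xargs -0 wc -l)

echo "$out"
rm -rf "$tmp"
```

With more than one file, wc also prints a total line at the end, which the per-line system() approach does not.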

0

Or use the pipe | as in bash, then retrieve the output in a variable with awk's getline, like this:

 zcat /var/log/fail2ban.log* | gawk  '/.*Ban.*/  {print $7};' | sort | uniq -c | sort | gawk '{ "geoiplookup " $2 "| cut -f2 -d: " | getline geoip; print $2 "\t\t" $1 " " geoip}'

That line will print all the banned IPs from your server along with their origin (country) using the geoip-bin package.

The last part of that one-liner is the one that matters here:

gawk '{ "geoiplookup " $2 "| cut -f2 -d: " | getline geoip; print $2 "\t\t" $1 " " geoip}'

It simply says: run the command "geoiplookup 182.193.192.4 | cut -f2 -d:" ($2 gets substituted, as you may guess) and put the first line of that command's output in geoip (the | getline geoip bit). Next, print the IP, the count, and whatever is in the geoip variable.
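Two details are easy to miss with this pattern: every distinct command string opens its own pipe, and a failed getline leaves the variable holding the previous line's value. A hedged sketch of a more defensive version follows. `echo ... | tr` is a stand-in for `geoiplookup ... | cut` so it runs without the geoip-bin package, and the input lines are invented.

```shell
out=$(printf '3 foo\n1 bar\n' | awk '{
    # Stand-in for "geoiplookup IP | cut -f2 -d:" (hypothetical swap).
    cmd = "echo " $2 " | tr a-z A-Z"
    label = ""                       # do not carry over the last result
    if ((cmd | getline label) <= 0)
        label = "unknown"            # command failed or printed nothing
    close(cmd)                       # one pipe per line, closed each time
    print $2 "\t" $1 " " label
}')

echo "$out"
```

Closing each command keeps the number of open pipes constant, which matters when the input has thousands of distinct IPs.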

The complete example and the results can be found here, an article I wrote.

ychaouche