Usually a gawk script processes each line of its stdin. Is it possible to instead specify a system command inside the script, and have the rest of the script process each line of that command's output?

For example consider the following simple interaction:

$ { echo "abc"; echo "def"; } | gawk '{print NR ":" $0; }'
1:abc
2:def

I would like to get the same output without using a pipe, instead specifying the echo commands as a system command inside the script.

I can of course use the pipe but that would force me to either use two different scripts or specify the gawk script inside the bash script and I am trying to avoid that.

UPDATE

The previous example is not quite representative of my use case; this is somewhat closer:

$ { echo "abc"; echo "def"; } | gawk '/d/ {print NR ":" $0; }'
2:def

UPDATE 2

A shell script parallel would be as follows. Without the exec line the script would read from stdin; with the exec line it reads the output of the command given on that line instead:

/tmp> cat t.sh
#!/bin/bash

exec 0< <(echo abc; echo def)
while IFS= read -r l; do
  echo "line: $l"
done
/tmp> ./t.sh 
line: abc
line: def
Miserable Variable
  • You can say `gawk '{print NR ":" $0; }' < <(echo "abc"; echo "def")`, for example. – fedorqui Feb 10 '15 at 10:21
  • @fedorqui this is almost identical to the pipe usage. I am looking to put the commands inside the awk script. – Miserable Variable Feb 10 '15 at 14:46
  • Then you are looking for [call a shell command from inside awk and pass some awk variables to the shell command](http://stackoverflow.com/questions/20646819/call-a-shell-command-from-inside-awk-and-pass-some-awk-variables-to-the-shell-co) – fedorqui Feb 10 '15 at 14:50
  • @fedorqui I am not able to relate that to my use case. Please see my updated example. – Miserable Variable Feb 10 '15 at 14:57
  • It is getting less clear to me what you want to achieve. Could you indicate what is your final goal so it is more clear? – fedorqui Feb 10 '15 at 14:59
  • @fedorqui perhaps my (imagined) use case is far fetched. Does the second example make it any more clear? – Miserable Variable Feb 10 '15 at 15:06
  • @MiserableVariable I adjusted my answer for your second example. – Tiago Lopo Feb 10 '15 at 15:09
  • Read http://www.gnu.org/software/gawk/manual/gawk.html#Getline_002fVariable_002fPipe and http://www.gnu.org/software/gawk/manual/gawk.html#Getline_002fVariable_002fCoprocess and arguably most importantly http://awk.info/?tip/getline – Ed Morton Feb 10 '15 at 21:48
  • @EdMorton I came across your excellent articles before asking the question but wasn't able to figure it out. Essentially the construct I am looking for is to redirect input to awk to a process in the BEGIN block, much as the `exec 0<` does in the shell script. Is that possible? – Miserable Variable Feb 12 '15 at 05:38
  • If I understand you, yes - `BEGIN{ARGV[1]="file"}` will cause awk to take its input from `file`. The general form to add files to the list provided as arguments is `BEGIN{ ARGV[ARGC] = "file"; ARGC++ }`. – Ed Morton Feb 12 '15 at 13:34
  • @EdMorton this is closest to what I am looking for. I want to use a process, not a fixed file; is there a way to do that? If not, I can execute the command with its output redirected to a temp file and then use the above. Not quite the same thing, but I don't need stream processing. – Miserable Variable Feb 12 '15 at 13:57
  • It's not clear what you want. Update your question to show sample input and expected output instead of just examples of commands you think could do whatever it is you want. – Ed Morton Feb 13 '15 at 04:14
  • @EdMorton thank you for your continued interest :) The general case is: the current construct is `some_command | awk_script`, the construct I desire is `modified_awk_script`, i.e. there is no input to the process; the `some_command` is instead specified somewhere in `modified_awk_script` itself. – Miserable Variable Feb 13 '15 at 08:34

2 Answers

From all of your comments, it sounds like what you want is:

$ cat tst.awk
BEGIN {
    if ( ("mktemp" | getline file) > 0 ) {
        system("(echo abc; echo def) > \"" file "\"")
        ARGV[ARGC++] = file
    }
    close("mktemp")
}

{ print FILENAME, NR, $0 }

END {
    if (file!="") {
        system("rm -f \"" file "\"")
    }
}

$ awk -f tst.awk
/tmp/tmp.ooAfgMNetB 1 abc
/tmp/tmp.ooAfgMNetB 2 def

but honestly, I wouldn't do it. You're munging what the shell is good at (creating/destroying files and processes) with what awk is good at (manipulating text).
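
For comparison, the conventional split that this answer argues for might look like the sketch below: the shell runs the command and awk only handles the text. The awk program is written to a temporary file here purely so that the sketch is self-contained; in practice it would live in its own file (the variable name `script` is just an illustration).

```shell
#!/bin/sh
# Illustrative only: create the awk program in a temporary file so the
# sketch runs on its own. Normally this would be a separate .awk file.
script=$(mktemp)
cat > "$script" <<'EOF'
/d/ { print NR ":" $0 }
EOF

# The shell sequences the processes; awk just filters the text.
out=$( { echo abc; echo def; } | awk -f "$script" )
printf '%s\n' "$out"

# Clean up the temporary awk program.
rm -f "$script"
```

This prints `2:def`, matching the second example in the question.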

Ed Morton
  • This is almost exactly what I meant when I wrote that I will redirect it to a temp file. You are quite correct that this is abusing awk, my choices are (i) use a shell script with inline awk script (ii) use an approach as above. I will have to decide how important syntax coloring is to me :) – Miserable Variable Feb 13 '15 at 16:03
  • @MiserableVariable it's actually just a decision between doing it the right way or the wrong way. The UNIX philosophy is to have a bunch of small tools each of which does one thing well, and a shell to sequence the calls to them. There is some overlap between the tools and the shell in what you CAN do but if you just stick to the right tool for every job you get a much more concise, robust and efficient result every time. – Ed Morton Feb 13 '15 at 16:17
  • With respect, I would like to point out that no such philosophical system can be entirely consistent and correct, and there are cases where rules need to be broken. What you consider simple -- using a shell script and a separate awk script -- may be unnecessary complexity under some circumstances. In any case, SO is not the forum for discussing subjective opinions, and this is borderline subjective. I continue to be grateful for your help. – Miserable Variable Feb 14 '15 at 04:35

I believe what you're looking for is getline:

awk '{ while ( ("echo abc; echo def" | getline line) > 0){ print line} }' <<< ''
abc
def

Adjusting the answer to your second example:

awk '{ while ( ("echo abc; echo def" | getline line) > 0){ counter++; if ( line ~ /d/){print counter":"line} } }' <<< ''
2:def

Let's break it down:

awk '{ 
       cmd = "echo abc; echo def"

       # each iteration reads the next line of cmd's output into the variable line
       while ( ( cmd | getline line) > 0){ 

          # we need our own counter: NR has already counted the empty input line
          counter++; 

          # if the line contains the letter d
          if ( line ~ /d/){ 
             print counter":"line
          } 
        } 
    }' <<< ''
    2:def
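
The same loop can also be placed in a BEGIN block, which removes the need for the dummy `<<< ''` input, since an awk program consisting only of BEGIN rules does not read any input. A minimal sketch, assuming the placeholder command from the question; calling `close(cmd)` afterwards releases the pipe once its output has been consumed:

```shell
#!/bin/sh
# Sketch: the command is named inside the awk program and read with getline.
out=$(awk 'BEGIN {
    cmd = "echo abc; echo def"      # placeholder command from the question

    # Read the command output one line at a time into the variable line.
    while ((cmd | getline line) > 0) {
        n++                         # explicit line counter
        if (line ~ /d/)
            print n ":" line
    }

    close(cmd)                      # release the pipe when done
}')
printf '%s\n' "$out"
```

The per-line logic lives inside the `while` loop rather than in ordinary pattern-action rules, which is the main trade-off of this approach.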
Tiago Lopo