
I am trying to use one variable in my AWK (or GAWK) program to print multiple columns.

I am taking the columns to print from the command line:

gawk -v cols=1,2,3 -f sample.awk -F,

I want to be able to set this variable in my BEGIN{} block, and use it in the main part of my program.

BEGIN{
  split(cols, col_arr, FS)

  i=1;
  col_str = "$"col_arr[1];
  for(col in col_arr){
    if (i > 1){ 
      col_str = col_str",$"col;
    }
    i++;
  } 
}

{
  print col_str
}

However, this will just print "$1,$2,$3". How can I change this to print columns 1, 2, and 3?

DJElbow

3 Answers


A BEGIN rule is executed once only, before the first input record is read.

Try something like this:

awk '{cols = $1 OFS $2 OFS $5; print cols}' file
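
The string built in the question prints literally because `$` in awk is an operator applied to a numeric expression, not text that is re-evaluated when printed. A minimal sketch on made-up input (the variable names `n`, `s`, and `out` are only for illustration):

```shell
# $ is an operator applied to a number, not text that gets re-evaluated.
out=$(printf 'a b c\n' | awk '{
    n = 2
    s = "$" n        # string concatenation: just the characters "$2"
    print s
    print $n         # field whose number is the value of n
    print $(n + 1)   # any numeric expression works
}')
printf '%s\n' "$out"
```

This prints `$2`, then `b`, then `c`, each on its own line.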

Update

Unlike the shell (and Perl), awk does not evaluate variables inside strings, so you either have to generate the script, as Jonathan Leffler shows, or look the fields up at run time with something like this:

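BEGIN{
       sub(/,$/,"",cols)        # drop a trailing comma, e.g. "2,3," -> "2,3"
       n=split(cols,C,/,/)      # C[1..n] hold the column numbers
}
# i and s are listed as parameters only to make them local
function _get_cols(i,s){
       # test i rather than length(s) so an empty first column keeps its separator
       for(i=1;i<=n;i++) s = (i>1 ? s OFS : "") $(C[i])
       return s
}
{
     print _get_cols()
}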

Run it like this:

awk -v cols=2,3, -f test.awk infile
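
To try the same lookup on comma-separated input, as in the question's `-F,` invocation, here is a small sketch. The wrapper name `pick_cols` and the sample data are assumptions made for the example; the awk logic follows the approach above.

```shell
# Hypothetical wrapper: pick_cols <comma-separated column numbers> < data
pick_cols() {
    awk -F, -v OFS=, -v cols="$1" '
    BEGIN { sub(/,$/, "", cols); n = split(cols, C, /,/) }
    {
        s = ""
        # $(C[i]) reads the field whose number is stored in C[i]
        for (i = 1; i <= n; i++) s = (i > 1 ? s OFS : "") $(C[i])
        print s
    }'
}

printf 'a,b,c,d\n1,2,3,4\n' | pick_cols 2,4
```

With this input the two output lines are `b,d` and `2,4`.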

Alternatively, let a shell function build the awk program:

#!/bin/bash

# Usage : _parse <FS> <OFS> 1 2 3 ... n < file
_parse()
{
    local fs="$1"
    local ofs="$2"
    shift 2
    local _s=
    local f

    for f; do
        _s="${_s}\$${f},"
    done
    awk -F"$fs" -v OFS="$ofs" "{ print ${_s%,} }"
}

# Call function
_parse ' ' '\t' 1 3 < infile
Akshay Hegde
  • I will actually be getting the columns to print from an array, and would like to define the values to print once, instead of continually looping through an array that contains the column numbers to print. I was just trying to keep the example simple. – DJElbow Oct 03 '14 at 04:46
  • Thanks for the examples. I will probably end up using a similar solution. – DJElbow Oct 03 '14 at 16:07

You are probably best off using a program (maybe awk) to write the awk script you ultimately run.

For example:

trap "rm -f script.awk; exit 1" 0 1 2 3 13 15

awk '{ printf "{ print ";
       pad = ""; for (i = 1; i <= NF; i++) { printf "%s$%d", pad, $i; pad = ", " }
       print " }"
     }' <<< "1 2 5" > script.awk

awk -f script.awk data.file

rm -f script.awk
trap 0

The columns to be printed are supplied as a here string, a Bash feature, but they could come from a file or from other sources as required. The trap commands are shell code that ensures the temporary file, script.awk, is removed. It might be better to embed the process ID in the name to ensure uniqueness if the script is run concurrently. If you're really worried about it, use mktemp or a similar program to create a harder-to-guess name. There is no requirement that the script file name end with .awk; the suffix just makes it clear what the file contains if you find it lying around.
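
The mktemp suggestion could look roughly like the sketch below. The function name `print_cols` and the `pc_` variable names are made up for this example, and it assumes a `mktemp` that works without arguments. It removes the file explicitly rather than with a trap, so an interrupted run could still leave the file behind.

```shell
# Hypothetical helper: print_cols "1 2 5" [file...]
print_cols() {
    pc_spec=$1
    shift
    pc_script=$(mktemp) || return 1

    # Turn "1 2 5" into the one-line program: { print $1, $2, $5 }
    printf '%s\n' "$pc_spec" |
    awk '{ printf "{ print "
           pad = ""
           for (i = 1; i <= NF; i++) { printf "%s$%d", pad, $i; pad = ", " }
           print " }"
         }' > "$pc_script"

    # Run the generated program on the named files, or on stdin if none given
    awk -f "$pc_script" "$@"
    pc_status=$?
    rm -f "$pc_script"
    return "$pc_status"
}

printf 'a b c d e\n' | print_cols "1 2 5"
```

For the sample line this prints `a b e`.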

Jonathan Leffler

Here's how to do it without loops or arrays:

jot -s '' -c - 65 126 | 
mawk -f <( mawk -v __='3,59,8,42,17,39' '
           BEGIN { 
               OFS =(FS = ",")"$"
             $(ORS = _) = __
             
          print "{ print $" ($!(NF=NF) ) " } " }' ) FS= OFS='\f'
C
 {
  H
   j
    Q
     g

The awk call inside the process substitution hard-codes the requested columns by generating this code on the fly:

# gawk profile, created Mon Jan 16 18:27:35 2023

# Rule(s)

 1  {
 1      print $3, $59, $8, $42, $17, $39
}
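
A more readable sketch of the same generate-then-run idea follows. It swaps the process substitution for command substitution, so it should also run in a plain POSIX shell. The name `print_fields` and the sample input are assumptions for illustration, and the column list is not validated, so it should only come from a trusted source.

```shell
# Hypothetical helper: print_fields <comma-separated column numbers> < data
print_fields() {
    # Build the program text once; a BEGIN-only program reads no input
    pf_prog=$(awk -v cols="$1" 'BEGIN {
        gsub(/,/, ", $", cols)          # "3,1" becomes "3, $1"
        print "{ print $" cols " }"     # yields: { print $3, $1 }
    }')
    # Run the generated program on stdin
    awk "$pf_prog"
}

printf 'a b c\n' | print_fields 3,1
```

For the sample line this prints `c a`.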
RARE Kpop Manifesto