
Continuing from my last post, "read line by line with awk and parse variables".

I want to buffer the result for a field to speed up parsing of log lines.
I tried:
awk 'BEGIN{OFS=","} { FS="\""; $0=$0; CIP=$4; (buffer[CIP]==0) { cmd="geoiplookup "CIP; cmd | getline buffer[CIP]; close(cmd) } ... print "CIP,..." >> mysql.infile }' $1

but I get a syntax error.

CIP=[ipaddress]
So I'm trying to buffer the result for each IP address so that the geoiplookup script is not run for every line, as that is slowing down the parsing.

Any help is appreciated...

vessel
  • It's fine to link to previous questions but try to include sufficient context here for this question to be answerable by itself. It's not clear at the moment what you're trying to do. – Tom Fenech Nov 24 '17 at 09:08
  • It's more of a syntax error question than a question about what I'm trying to do. Anyway, I edited the question to make it clearer. – vessel Nov 24 '17 at 10:17
  • Do you have `FS="\""; $0=$0; CIP=$4;` just in the middle of the script, not inside a block `{ }`? – Tom Fenech Nov 24 '17 at 10:22
  • @TomFenech : thanks I did it but still get syntax error at `(buffer[CIP]==0) { cmd="geoiplookup "CIP; cmd | getline buffer[CIP]; close(cmd) }` – vessel Nov 24 '17 at 10:28
  • Now you have an opening `{`, but no closing `}`... – Tom Fenech Nov 24 '17 at 10:33
  • @TomFenech : I have a closing `}' $1` but includes the buffer() also, do you think that's the syntax error? I need buffer to be before print results... – vessel Nov 24 '17 at 10:51
  • Another quick fix would be to put `if` in front of your buffer line, i.e. `if (buffer[CIP]==0) { cmd="geoiplookup "CIP; cmd | getline buffer[CIP]; close(cmd) }`. I assume your awk script has only one rule. – kvantour Nov 24 '17 at 11:52
  • If you want help fixing a syntax error, doesn't it just make sense to tell us what the syntax error message is? – Ed Morton Nov 24 '17 at 15:44

1 Answer


I do not see a direct problem with the buffering itself, as it works in this example:

printf '172.217.22.132\n172.217.22.132\n' | \
     awk '{CIP=$1}
      # run the lookup only if this address is not buffered yet
      (buffer[CIP]==0) { print "Calling geoiplookup";
        cmd="geoiplookup "CIP;
        cmd | getline buffer[CIP];
        close(cmd) }
      {print buffer[CIP]}'

This produces:

Calling geoiplookup
GeoIP Country Edition: US, United States
GeoIP Country Edition: US, United States

As you can see, geoiplookup is called only once, so the buffering works.

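There is, however, a bug in your code: `(buffer[CIP]==0) { ... }` is a pattern-action rule, but it sits inside another action block, where awk only accepts statements. Patterns are allowed only at the top level. The output file name also has to be a quoted string. The following should work better:

awk  'BEGIN{OFS=","}
{ FS="\""; $0=$0; CIP=$4; }
(buffer[CIP]==0) { cmd="geoiplookup "CIP; cmd | getline buffer[CIP]; close(cmd) }
...
{print "CIP,..." >> "mysql.infile" }' "$1"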

I think the main problem here is a misunderstanding of awk's syntax.

In its simplest form, awk is a line (or record) parser. An awk program is built from a set of rules of the form

pattern1 { action1 }
pattern2 { action2 }
...

These should be interpreted as: if pattern1 is satisfied, perform action1 on the current line, then continue to pattern2. If no pattern is given, it is assumed to be true and the corresponding action is executed for every line.
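To make the rule structure concrete, here is a minimal sketch that is not related to your log format. The input data and the function name `rules_demo` are made up for illustration. It has one rule with a pattern, one rule without a pattern, and an `END` rule.

```shell
rules_demo() {
    printf 'alpha 1\nbeta 20\ngamma 3\n' |
    awk '
        $2 > 10 { print "big:", $1 }       # pattern + action: runs only when field 2 exceeds 10
                { n++ }                    # no pattern: runs for every record
        END     { print "records:", n }    # special pattern: runs once after the last record
    '
}

rules_demo
```

This should print `big: beta` followed by `records: 3`: the first rule fires only for the second line, while the pattern-less rule counts all three.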

In the first example above, there are 3 rules.

  • `{CIP=$1}` has no pattern, so for every line it sets the variable CIP to `$1`.
  • `(buffer[CIP]==0) { print ... }` states that the second action (`print ...`) is performed only if `buffer[CIP]` is zero or unset, i.e. the value has not been computed yet.
  • The final rule, again without a pattern, prints the buffered value for every line.
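If you would rather keep everything in a single action block, the same caching can be written with an `if` statement, as suggested in the comments. The sketch below is only an illustration under a few assumptions: `echo` stands in for geoiplookup so that it runs anywhere, the IP is taken from `$1` rather than from your quoted field, and the names `cached_lookup`, `cache`, and `calls` are made up. It tests membership with `in` instead of comparing to 0, so an empty or zero result is not looked up again.

```shell
cached_lookup() {
    awk '
    {
        ip = $1
        if (!(ip in cache)) {                  # this address has not been looked up yet
            cmd = "echo country-for-" ip       # stand-in command; e.g. "geoiplookup " ip
            if ((cmd | getline result) > 0)    # getline returns 1 when a line was read
                cache[ip] = result
            else
                cache[ip] = "unknown"          # command failed or printed nothing
            close(cmd)
            calls++                            # count how often the command really ran
        }
        print ip, cache[ip]
    }
    END { print "lookups:", calls }
    '
}

printf '10.0.0.1\n10.0.0.2\n10.0.0.1\n' | cached_lookup
```

With three input lines but only two distinct addresses, the last line of output should be `lookups: 2`, which shows that the repeated address was served from the array.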

I hope this helped.

kvantour