
I have a log file that contains some lines I need to grab:

Jul  2 06:42:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (3)
Jul  2 06:51:00 myhostname error proc[12345]: 01310001:3: event code xxxx Slow transactions attack detected - account id: (20), number of dropped slow transactions: (2)

The account id (xx) maps to the name of an object, which I can retrieve through a MySQL query.

The following command (certainly not optimized, but working) gives me the number of matching lines per account id:

grep "Slow transactions" logfile| awk '{print $18}' | awk -F '[^0-9]+' '{OFS=" ";for(i=1; i<=NF; i++) if ($i != "") print($i)}' | sort | uniq -c
 14 20

The output (14 20) means the account id 20 was observed 14 times (14 lines in the logfile).
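
For what it's worth, the two chained awk calls can probably be collapsed into one by stripping the non-digit characters in place (an untested sketch, still assuming the account id sits in field 18 as above):

grep "Slow transactions" logfile | awk '{ gsub(/[^0-9]/, "", $18); print $18 }' | sort | uniq -c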


Then there is also the "number of dropped slow transactions: (2)" part. This gives the actual number of dropped transactions that was logged. In other words, a single log entry can represent one or more dropped transactions.

I do have a small command to count the number of dropped transactions:

grep "Slow transactions" logfile | awk '{print $24}' | sed 's/(//g' | sed 's/)//g' | awk '{s+=$1} END {print s}'
73

That means 73 transactions were dropped.
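
The sed calls can likely be folded into awk the same way (untested sketch, assuming the dropped count is always in field 24 as above):

grep "Slow transactions" logfile | awk '{ gsub(/[^0-9]/, "", $24); s += $24 } END { print s }'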


These two commands work, but I am stuck when it comes to merging them. I really don't see how to combine them; I am pretty sure awk can do it (and probably in a better way than I did), but I would appreciate it if any expert from the community could give me some guidance.


Update: since the above was too easy for some of the awk experts on SO, I am introducing an optional feature :)

As previously mentioned, I can convert an account ID into a name by issuing a MySQL query. So the idea is now to include the ID => name conversion in the awk command.

The MySQL query looks like this (XX being the account ID):

 mysql -Bs -u root -p$(perl -MF5::GenUtils -e "print get_mysql_password.qq{\n}") -e "SELECT name FROM myTABLE where account_id= 'XX'"

I found the post below, which deals with feeding command output into awk, but I am facing syntax errors...

How can I pass variables from awk to a shell command?
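
To illustrate the direction I mean, here is a rough, untested sketch that shells out to the query from gawk via a command pipe and getline (the password option is left out for readability; myTABLE and account_id are the names from the query above):

awk -F '[()]' '
    /Slow transactions/ { seen[$2]++ }
    END {
        for (id in seen) {
            # \047 is a single quote; password handling left out here
            cmd = "mysql -Bs -u root -e \"SELECT name FROM myTABLE WHERE account_id = \047" id "\047\""
            name = id                # fall back to the raw id if the lookup returns nothing
            cmd | getline name
            close(cmd)
            print id, "=>", name
        }
    }
' logfile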


1 Answer

This uses parentheses as your field separator, so it's easier to grab the account number and the number of dropped slow transactions.

awk -F '[()]' '
    /Slow transactions/ {
        acct[$2]++
        dropped[$2] += $4
    }
    END {
        PROCINFO["sorted_in"] = "@ind_num_asc"     # https://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html

        for (acctnum in acct)
            print acctnum, acct[acctnum], dropped[acctnum]
    }
' logfile

Given your sample input, this outputs

20 2 5

Requires GNU awk for the "sorted_in" method of controlling array traversal order.
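
Without GNU awk, one option is to drop the PROCINFO line and sort the output externally instead, for example (untested sketch):

awk -F '[()]' '
    /Slow transactions/ { acct[$2]++; dropped[$2] += $4 }
    END { for (a in acct) print a, acct[a], dropped[a] }
' logfile | sort -n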

  • How come you're sorting them? I think OP only did so they could use uniq. – 123 Jul 02 '15 at 15:04
  • Thanks for the fast reply. I'm getting the two commands into one single output indeed. But this is not exactly what I'm looking for (maybe my description is not clear enough): what I wanted to do is to have the sum of X ("dropped slow transactions: (X) ") per account id (account id: (YY)). – Xxmusashi Jul 02 '15 at 15:24
  • Thank you Glenn. So if I understand correctly, you use a kind of array to store and compute values, right? The two lines after the pattern search are confusing me a bit... I don't really understand them. – Xxmusashi Jul 02 '15 at 19:55
  • 1
    my field separators are "(" and ")". Therefore, the 2nd field is the account number and the 4th field is the number of drops. `acct[$2]++` means increment (by 1) the value in the "acct" associative array at index "account number"; and `dropped[$2] += $4` means increment (by the number of drops) the value in the "dropped" associative array at index "account number" – glenn jackman Jul 02 '15 at 21:55
  • Thank you again Glenn, works like a charm! One last thing I wanted to do as an extra was to replace the account ID with its name. I have the mysql command to do the mapping. I edited my question so that you have this extra challenge too. – Xxmusashi Jul 02 '15 at 22:57