I am looking to parse the responses in a Microsoft DNS debug log. The idea is to extract the domains and print a list showing the number of times each domain occurs in the log. Typically I would use something like grep -v " R " log > tmp to first redirect all of the responses to a file, and then manually grep for each domain with something like grep domain tmp. I assume there is a better way.
20140416 01:38:52 588 PACKET 02030850 UDP Rcv 192.168.0.10 2659 R Q [8281 DR SERVFAIL] A (11)quad(3)sub(7)domain(3)com(0)
20140416 01:38:52 588 PACKET 02396370 UDP Rcv 192.168.0.5 b297 R Q [8281 DR SERVFAIL] A (3)pk(3)sub(7)domain(3)com(0)
20140415 19:46:24 544 PACKET 0261F580 UDP Snd 192.168.0.2 795a Q [0000 NOERROR] A (11)tertiary(7)domain(3)com(0)
20140415 19:46:24 544 PACKET 01A47E60 UDP Snd 192.168.0.1 f4e2 Q [0001 D NOERROR] A (11)quad(3)sub(7)domain(3)net(0)
For the above data, something like the following output would be great:
domain.com 3
domain.net 1
This would indicate that the script or command found three entries for domain.com and one for domain.net. I am not concerned with tertiary or deeper host names; they can simply be counted under their parent domain. A shell command or Python would be fine. Here's some pseudocode to hopefully drive the question home.
# read the whole debug log into memory
with open('log', 'r') as theFile:
    FILE = theFile.readlines()

printList = []
# search for unique queries and count them
for line in FILE:
    if ' Q ' in line:  # stand-in for "query for the ' Q ' field"
        # store until count for this uniq value is complete
        printList.append(line)

for item in printList:
    print(item)  # should become the summary: each unique domain and its count
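
One possible way to get the counting described above, as a minimal sketch rather than a definitive solution. It assumes the queried name is always the last field on a PACKET line in the length-prefixed form shown in the sample, such as (3)sub(7)domain(3)com(0), and that the last two labels are the domain to count (this would be wrong for suffixes like co.uk). The names NAME_RE, registered_domain, count_domains and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the name is the last field and ends with the root label "(0)".
NAME_RE = re.compile(r'((?:\(\d+\)[^()\s]+)+)\(0\)\s*$')

def registered_domain(line):
    """Return the last two labels of the queried name, or None if no name is found."""
    match = NAME_RE.search(line)
    if match is None:
        return None
    # Drop the "(n)" length prefixes and keep only the label text.
    labels = re.findall(r'\(\d+\)([^()\s]+)', match.group(1))
    return '.'.join(labels[-2:]).lower()

def count_domains(lines):
    """Count how often each two-label domain appears on PACKET lines."""
    counts = Counter()
    for line in lines:
        if ' PACKET ' not in line:
            continue  # skip headers and other non-packet lines
        domain = registered_domain(line)
        if domain:
            counts[domain] += 1
    return counts

# The four sample lines from the question.
SAMPLE = [
    '20140416 01:38:52 588 PACKET 02030850 UDP Rcv 192.168.0.10 2659 R Q [8281 DR SERVFAIL] A (11)quad(3)sub(7)domain(3)com(0)',
    '20140416 01:38:52 588 PACKET 02396370 UDP Rcv 192.168.0.5 b297 R Q [8281 DR SERVFAIL] A (3)pk(3)sub(7)domain(3)com(0)',
    '20140415 19:46:24 544 PACKET 0261F580 UDP Snd 192.168.0.2 795a Q [0000 NOERROR] A (11)tertiary(7)domain(3)com(0)',
    '20140415 19:46:24 544 PACKET 01A47E60 UDP Snd 192.168.0.1 f4e2 Q [0001 D NOERROR] A (11)quad(3)sub(7)domain(3)net(0)',
]

for domain, count in count_domains(SAMPLE).most_common():
    print(domain, count)
```

To run this against a real log, the open file object can be passed in place of SAMPLE, for example count_domains(open('log')). To count only responses or only queries, an extra test on ' R ' could be added next to the ' PACKET ' check.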