-1

In Linux it's straightforward:

awk '{print $1}' logfile | sort | uniq -c | sort -nr | head -n 5

How can I transform the same logic into a Python function? Thanks.

Rup
Bek
  • 2
    Please post the code you have tried so far, and read the advice on [how to write a good question on StackOverflow](http://stackoverflow.com/help/how-to-ask). – i alarmed alien Aug 09 '18 at 16:14

2 Answers

0

You can use the subprocess module. The answer linked here addresses your question: https://stackoverflow.com/a/13332300/4257838
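As a rough illustration of the subprocess suggestion, the original shell pipeline could be wrapped in a function like the one below. This is only a sketch and is not taken from the linked answer: the function name `top_ips_via_shell` and its parameters are hypothetical, and it assumes `awk`, `sort`, `uniq` and `head` are available on the PATH (so it will not work on a stock Windows install).

```python
import shlex
import subprocess


def top_ips_via_shell(logfile, n=5):
    """Run the awk | sort | uniq | sort | head pipeline and return its text output.

    Hypothetical helper: assumes a Unix-like shell with the usual tools installed.
    """
    # shlex.quote protects the file name from being interpreted by the shell.
    pipeline = (
        f"awk '{{print $1}}' {shlex.quote(logfile)} "
        f"| sort | uniq -c | sort -nr | head -n {int(n)}"
    )
    result = subprocess.run(
        pipeline, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `print(top_ips_via_shell('logfile'))` would print the same count/IP lines the shell command produces. Shelling out keeps the logic identical to the one-liner, at the cost of portability.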

Rafał
  • def apache_log_reader(file):
        myregex = r'\d{,3}\.\d{,3}\.\d{,3}\.\d{,3}'
        with open("C:/Users/Otabek/Desktop/TestA.txt") as f:
            log = f.read()
        my_iplist = re.findall(myregex,log)
        ipcount = Counter(my_iplist)
        for k, v in ipcount.items():
            print("IP Address " + "=> " + str(k) + " " + "Count " + "=> " + str(v))
            #list = [k]
            #sl = list.sort()
            #print (list)

    # Create entry point of our code
    if __name__ == '__main__':
        apache_log_reader("C:/Users/Otabek/Desktop/TestA.txt")

    – Bek Aug 09 '18 at 21:49
0

You can use collections.Counter to count the occurrences of unique IPs and then sort and slice the resulting dict items:

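from collections import Counter
from operator import itemgetter

# Count the first whitespace-separated field of each non-blank line.
with open('logfile') as f:
    counts = Counter(line.split()[0] for line in f if line.strip())

# Sort the (ip, count) pairs by count, descending, and keep the top five.
for ip, n in sorted(counts.items(), key=itemgetter(1), reverse=True)[:5]:
    print(n, ip)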
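Since the question asks for a function, the same Counter approach could be packaged roughly as follows. This is a sketch under assumptions: the name `top_ips` and its parameters are made up for illustration, and it treats the first whitespace-separated field of each line as the IP, as the awk command does. It uses `Counter.most_common`, which returns the entries already ordered from highest to lowest count, in place of the explicit sort and slice.

```python
from collections import Counter


def top_ips(path, n=5):
    """Return the n most frequent first fields in the file as (ip, count) pairs.

    Hypothetical helper; blank lines are skipped so split()[0] cannot fail.
    """
    with open(path) as f:
        counts = Counter(line.split()[0] for line in f if line.strip())
    # most_common(n) yields the n highest counts in descending order.
    return counts.most_common(n)
```

A caller could then print the result in the same layout as the shell output with `for ip, count in top_ips('logfile'): print(count, ip)`.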
blhsing
  • Ran it and got an error: TypeError Traceback (most recent call last) in () 1 from collections import Counter 2 from operator import itemgetter ----> 3 for i, n in sorted(Counter(l.split()[0] for l in open("C:/Users/Otabek/Desktop/TestA.txt").readlines()).items(), key=itemgetter(1), reverse=True)[:5]: 4 print(n, i) TypeError: 'list' object is not callable – Bek Aug 09 '18 at 21:40
  • You may have accidentally overridden one of the built-in functions by assigning a different value to it. Try running the above code in a clean IPython session. – blhsing Aug 10 '18 at 04:46
  • 1
    Nice, thanks so much. But it returns the top lines; it doesn't sort them by number of occurrences. – Bek Aug 10 '18 at 13:51
  • Glad to be of help. Can you mark this answer as accepted if you find it to be correct? – blhsing Aug 10 '18 at 13:51
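The "'list' object is not callable" error discussed in the comments above can be reproduced by rebinding a built-in name to a list. The thread does not say which name was overridden in the asker's session, so `sorted` is used here purely as an assumed example, and `shadowing_error` is a hypothetical helper name.

```python
import builtins


def shadowing_error():
    """Rebind a built-in name to a list and return the resulting error message."""
    sorted = [3, 1, 2]      # this local name now hides the built-in sorted()
    try:
        sorted([2, 1])      # attempts to call the list, not the built-in
    except TypeError as exc:
        return str(exc)


print(shadowing_error())         # 'list' object is not callable
print(builtins.sorted([2, 1]))   # the real built-in is still reachable: [1, 2]
```

In an interactive session, deleting the offending name (for example `del sorted`) or restarting the session makes the built-in visible again.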