I'm trying to sort and custom-print the results from an Apache access.log file. The output should show the total hits per month, sorted by month, like this:

Nov 2017 hits count - 12512

Dec 2017 hits count - 10087

Jan 2018 hits count - 12561

Here is part of the access.log for reference:

91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET / HTTP/1.1" 404 239 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET /images/ HTTP/1.1" 200 1963 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
91.244.19.46 - - [12/Dec/2015:19:02:36 +0100] "GET /template/ HTTP/1.1" 200 10004 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-login.php HTTP/1.1" 200 1801 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
193.47.55.21 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-admin/ HTTP/1.1" 200 1457 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
193.47.55.21 - - [12/Dec/2015:19:02:36 +0100] "GET /template/ HTTP/1.1" 200 3465 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"
11.114.21.37 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-login.php HTTP/1.1" 200 4890 "http://localhost/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36" "-"

I came up with something like this:

cat access.log |sort -k1n|awk '{print $4}'|cut -d: -f1|cut -d/ -f2-3|sed 's/\[//g'|tr '/' ' '|sort -k2n -k1M

It does the job, printing the month/year and the hit count, but I need the output to look like the example above. In other words, I want the text "hits count -" placed between the month/year and the actual number of hits. Any idea how I can do that?

Thank you in advance.

P....
Ivanovich

1 Answer

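cat access.log | awk '{ print substr($4,5,3), substr($4,9,4) }' | \
 sort -k2,2n -k1,1M | \
 uniq -c | \
 awk '{ print $2, $3, "hits count -", $1 }'

First print the month and year (characters 5-7 and 9-12 of the timestamp field, e.g. [12/Dec/2015:19:02:36),

then sort by year and by month name (a plain sort -k1 would order the months alphabetically; the sort is not needed for the counting itself when the log is already in chronological order, because uniq -c only merges adjacent identical lines),

then count the unique lines,

then print the month, the year, "hits count -", and the number counted.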
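
The same steps can also be folded into a single awk pass that counts in an array, which avoids relying on fixed character offsets. This is only a sketch under a few assumptions: the timestamp is field 4 in the usual [dd/Mon/yyyy:hh:mm:ss form, sort supports -M (GNU and BSD sort do), and monthly_hits is a made-up function name. The sample lines are invented for the demonstration.

```shell
#!/bin/sh
# monthly_hits is a hypothetical helper name, not part of any tool.
# Assumes field 4 looks like [12/Dec/2015:19:02:36
monthly_hits() {
    awk '{
        split($4, t, "[/:]")       # t[2] = month name, t[3] = year
        hits[t[2] " " t[3]]++      # one counter per "Mon YYYY"
    }
    END {
        for (key in hits) print key, "hits count -", hits[key]
    }' "$1" | LC_ALL=C sort -k2,2n -k1,1M   # year first, then month name
}

# Made-up sample lines, deliberately out of chronological order.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.0.0.1 - - [03/Jan/2018:10:00:00 +0100] "GET / HTTP/1.1" 200 100
10.0.0.2 - - [12/Nov/2017:19:02:36 +0100] "GET / HTTP/1.1" 200 100
10.0.0.1 - - [25/Dec/2017:08:15:00 +0100] "GET /a HTTP/1.1" 200 100
10.0.0.3 - - [13/Nov/2017:19:02:36 +0100] "GET /b HTTP/1.1" 404 100
EOF

monthly_hits "$sample"
# Nov 2017 hits count - 2
# Dec 2017 hits count - 1
# Jan 2018 hits count - 1
rm -f "$sample"
```

LC_ALL=C is set for sort so that -M recognises the English month abbreviations regardless of the system locale.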

Luuk
  • Thank you. Works like a charm. Is it possible to print the uniq ip hit counts instead of the total number of hits? – Ivanovich Mar 11 '20 at 21:37
  • Yes, of course this is possible. You need to learn the basics of (g)awk, and then you can change these commands yourself. SO (stackoverflow) is not a website where people write code for you..... – Luuk Mar 13 '20 at 18:55
  • Probably get rid of the [useless `cat`.](/questions/11710552/useless-use-of-cat) – tripleee Dec 01 '20 at 13:13
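
For the follow-up about unique IPs, one possible approach is to remember each (month, IP) pair the first time it appears and count only those first sightings. This is a rough sketch, not the answerer's code: it assumes the client address is field 1 and the timestamp is field 4, the function name monthly_unique_ips and the "unique IPs -" label are made up, and the sample lines are the ones from the question with the referrer and user-agent fields trimmed. It also reads the file directly instead of going through cat.

```shell
#!/bin/sh
# monthly_unique_ips is a hypothetical helper name.
# Assumes field 1 is the client IP and field 4 looks like [12/Dec/2015:19:02:36
monthly_unique_ips() {
    awk '{
        split($4, t, "[/:]")                     # t[2] = month name, t[3] = year
        month = t[2] " " t[3]
        if (!seen[month, $1]++) ips[month]++     # count each IP once per month
    }
    END {
        for (m in ips) print m, "unique IPs -", ips[m]
    }' "$1" | LC_ALL=C sort -k2,2n -k1,1M
}

# The question's sample lines, shortened.
sample=$(mktemp)
cat > "$sample" <<'EOF'
91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET / HTTP/1.1" 404 239
91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET /images/ HTTP/1.1" 200 1963
91.244.19.46 - - [12/Dec/2015:19:02:36 +0100] "GET /template/ HTTP/1.1" 200 10004
91.244.19.43 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-login.php HTTP/1.1" 200 1801
193.47.55.21 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-admin/ HTTP/1.1" 200 1457
193.47.55.21 - - [12/Dec/2015:19:02:36 +0100] "GET /template/ HTTP/1.1" 200 3465
11.114.21.37 - - [12/Dec/2015:19:02:36 +0100] "GET /wp-login.php HTTP/1.1" 200 4890
EOF

monthly_unique_ips "$sample"
# Dec 2015 unique IPs - 4
rm -f "$sample"
```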