You want to count lines grouped by a value found in each line. That's a good job for awk. With grep alone, you would have to scan the input files once for every day. Either way, we need to fix your regex first:
zgrep -E "[01\-31]/Jul/2021:[08\-16]" localhost_access.log* | wc -l
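[08\-16] is a bracket expression, so it matches a single character out of 0, 8, -, 1 and 6 rather than the hour range 08-16. What you want to match is (0[89]|1[0-6]); that is, 0 followed by 8 or 9, or 1 followed by a digit in the range 0-6. The alternation has to sit inside a single group; otherwise the | splits the whole surrounding pattern in two. To keep things simple, we assume valid days in the date and match the day with [0-9]{2} (two digits).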
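As a quick sanity check of the hour pattern, the sketch below lists every two-digit hour from 00 to 23 that the anchored pattern accepts. The function name matching_hours is made up for this illustration.

```shell
# Hypothetical helper: print all hours (00-23) accepted by the hour pattern.
matching_hours() {
  awk 'BEGIN {
    out = ""
    for (h = 0; h < 24; h++) {
      s = sprintf("%02d", h)            # zero-padded hour, e.g. "08"
      if (s ~ /^(0[89]|1[0-6])$/)       # anchored so the whole hour must match
        out = out (out == "" ? "" : " ") s
    }
    print out
  }'
}

matching_hours
```

Only the hours 08 through 16 should show up in the list; 07 and 17 should not.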
Here's a complete awk command for your task:
awk -F/ '/[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/{a[$1]++}END{for (i in a) print "day " i ": " a[i]}' localhost_access.log*
Explanation:
/[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/
matches lines whose timestamp is in July 2021 with an hour from 08 to 16, on any day
{a[$1]++}
builds an array keyed by day, with the number of matching lines as the value.
END{for (i in a) print "day " i ": " a[i]}
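prints the array once all input files have been processed. The order of for (i in a) is unspecified, so pipe the output through sort if you need the days in order.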
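Because we've set the field separator to /, $1 is everything before the first slash, including any text that precedes the day. If your real log lines contain slashes before the date, you need to change a[$1] to address the correct field (for two more slashes before the actual date: a[$3]). (Of course this can be solved in a more dynamic way.)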
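One possible "more dynamic way" is to let awk locate the date itself with match() and cut the day out of the matched text, so the position of the date in the line no longer matters. This is only a sketch: the function name count_by_day and the sample log layout are assumptions, not taken from your files. It also writes [0-9][0-9] instead of [0-9]{2}, because some older awk implementations do not support interval expressions.

```shell
# Hypothetical helper: count matching lines per day, wherever the date appears.
count_by_day() {
  awk '
    # match() sets RSTART to the position where the date begins
    match($0, /[0-9][0-9]\/Jul\/2021:(0[89]|1[0-6])/) {
      day = substr($0, RSTART, 2)       # first two characters of the match
      count[day]++
    }
    END { for (d in count) print "day " d ": " count[d] }
  ' "$@" | sort                         # for-in order is unspecified, so sort
}

# Made-up sample lines in a common access-log layout
printf '%s\n' \
  '127.0.0.1 - - [01/Jul/2021:08:15:02 +0200] "GET /a HTTP/1.1" 200 15' \
  '127.0.0.1 - - [01/Jul/2021:16:59:59 +0200] "GET /b HTTP/1.1" 200 15' \
  '127.0.0.1 - - [02/Jul/2021:09:00:00 +0200] "GET /c HTTP/1.1" 200 15' \
  '127.0.0.1 - - [02/Jul/2021:07:30:00 +0200] "GET /d HTTP/1.1" 200 15' \
  '127.0.0.1 - - [02/Jul/2021:17:00:00 +0200] "GET /e HTTP/1.1" 200 15' |
  count_by_day
```

With real files you would call it as count_by_day localhost_access.log* instead of piping sample lines into it.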
Example:
$ cat localhost_access.log
01/Jul/2021:08 log message
01/Jul/2021:08 log message
02/Jul/2021:08 log message
02/Jul/2021:07 log message
$ awk -F/ '/[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/{a[$1]++}END{for (i in a) print "day " i ": " a[i]}' localhost_access.log*
day 01: 2
day 02: 1
If your log files are compressed, run zcat | awk instead, but remember that the regex above only looks for "Jul/2021".
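For a mix of compressed and plain log files, a wrapper along these lines might help. It is a sketch under two assumptions: the function name count_july_hours is made up, and gzip -cdf is expected to pass non-gzip input through unchanged, which is the documented behaviour of GNU gzip but worth checking on your system.

```shell
# Hypothetical helper: count July 2021 lines between 08 and 16 per day,
# reading both .gz and uncompressed files.
count_july_hours() {
  # -c writes to stdout, -d decompresses, -f lets plain files pass through
  gzip -cdf "$@" |
    awk -F/ '/[0-9][0-9]\/Jul\/2021:(0[89]|1[0-6])/ { a[$1]++ }
             END { for (i in a) print "day " i ": " a[i] }' |
    sort
}
```

It would be called as count_july_hours localhost_access.log* and, like the command above, assumes each line starts with the date.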