How to grep log files during a specific time period

Question

Okay, so I have log files and I would like to search within specific time ranges. These ranges will be different throughout the day. Below is a piece of a log file; it is the only piece I can show you, sorry, work stuff. I am using the cat command, if that matters.

Working EXAMPLE : cat /dir/dir/dir/2014-07-30.txt | grep *someword* | cut -d',' -f1,4,3,7

2014-07-30 19:17:34.542 ;; (p=0,siso=0)

The above gets me the info I need along with the timestamp, but it shows all time ranges, and that is what I would like to correct. Let's say I only want hours 18 to 20 in the first column of the time.

Actual --> 2014-07-30 19:17:34.542 ;; (p=0,siso=0)

Only range I am looking for --> [18-20]:00:00.000 ;; (p=0,siso=0)

I am not worried about the 00s as they can be any digit.

Thanks for looking. I have not used much in the way of scripting as you can tell from my example, but any help is greatly appreciated.

I have included a line from a log file; the colons and commas are where they should be.

2014-07-30 14:33:19.259 ;; (p=0,ser=0,siso=0) IN ### Word:Numbers=00000,word=None something goes here and here (something here andhere:here also here:2222),codeword=8,codeword=0,Noideanumbers=00000000,something=something, ;;

That's a useless use of cat for the record. `grep '*someword*' /dir/dir/dir/2014-07-30.txt` does the same thing without the extra process and pipe. — Etan Reisner, Jul 31 '14 at 02:18
It sure does, but I use the pipes and the extra process because I need certain pieces of info from the log file. I realize I can put grep at the front. Thanks for your input. — ZeroLoop, Jul 31 '14 at 02:38
I don't follow. The cat in that pipeline doesn't do anything at all for you. It can't (except stop grep from knowing that you are reading from a file and what the filename is). — Etan Reisner, Jul 31 '14 at 02:44
Well, if I use your command with grep at the front, along with my pipes and delimiters, I get the same info but with the directory info at the front, whereas with mine I get only the info I need without the extra directory jargon. We search through log files in hundreds of directories at a time and only need key info. — ZeroLoop, Jul 31 '14 at 02:56
Are you talking about the filename prefix (`/path/to/file:`) that grep puts on output lines when fed more than one file? Because `-h` turns that off. — Etan Reisner, Jul 31 '14 at 03:15
I'm really new to Linux and did not know that, but it does the same thing, so I will try them both and let you know. — ZeroLoop, Jul 31 '14 at 04:38

score 2 · Answer 1 · answered Jul 31 '14 at 02:26

Using awk:

logsearch() {
    grep "$3" "$4" | awk -v start="$1" -v end="$2" '{split($2, a, /:/)} (a[1] >= start) && (a[1] <= end)'
}

# logsearch <START> <END> <PATTERN> <FILE>
logsearch 18 20 '*someword*' /dir/dir/dir/2014-07-30.txt

Or with only awk (possibly different pattern quoting requirements):

logsearch2 ()
{
    awk -v start="$1" -v end="$2" -v pat="$3" '($0 ~ pat) {split($2, a, /:/)} ($0 ~ pat) && (a[1] >= start) && (a[1] <= end)' "$4"
}
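
A minimal sketch for checking the hour filter against the timestamp layout shown in the question. The sample lines, the temporary file, and the helper name hour_filter are made up for illustration; the logic follows the same idea as logsearch2 (match the pattern, split field 2 on colons, compare the hour numerically).

```shell
# Hypothetical sample data in the layout from the question (not real log content).
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
2014-07-30 14:33:19.259 ;; (p=0,siso=0) someword
2014-07-30 18:05:00.001 ;; (p=0,siso=0) someword
2014-07-30 19:17:34.542 ;; (p=0,siso=0) otherword
2014-07-30 20:59:59.999 ;; (p=0,siso=0) someword
2014-07-30 21:00:00.000 ;; (p=0,siso=0) someword
EOF

# hour_filter START END PATTERN FILE (hypothetical helper):
# print lines matching PATTERN whose hour (first colon-separated
# part of field 2) lies in [START, END], compared as numbers.
hour_filter() {
    awk -v start="$1" -v end="$2" -v pat="$3" '
        $0 ~ pat {
            split($2, t, ":")
            if (t[1] + 0 >= start + 0 && t[1] + 0 <= end + 0) print
        }' "$4"
}

result=$(hour_filter 18 20 someword "$logfile")
printf '%s\n' "$result"
rm -f "$logfile"
```

Note that an end hour of 20 keeps everything up to 20:59:59.999; if the window should stop at 20:00, pass 19 as the end hour instead.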

score 1 · Answer 2 · answered Jul 31 '14 at 02:47

Not having seen the original input data I'm guessing from your cut what's going on.

Will this give you something similar to your desired outcome?

 awk -F, '/someword/ && $4 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$4,$3,$7}' /dir/dir/dir/2014-07-30.txt

That said: a bit of sample data typically goes a long way!

Edit1:

Given the input line you added to both your comment and the original post the following awk statement does what you're asking:

awk '/something/ && $2 ~ /^(18|19|20)/{printf "%s %s %s %s\n", $1,$2,$3,$4}' /path/to/your/input_file

edited Jul 31 '14 at 03:51

answered Jul 31 '14 at 02:47

tink


I will see if I can create something that will help a little better – ZeroLoop Jul 31 '14 at 03:00
This is a sample and the colons and commas are where they should be. 2014-07-30 14:33:19.259 ;; (p=0,ser=0,siso=0) IN ### Word:Numbers=000000000000,word=None something goes here and here (something here andhere:here also here:2222),codeword=8,codeword=0,Noideanumbers=00000000,something=something, ;; – ZeroLoop Jul 31 '14 at 03:05
Hmmm ... with that input your cut leaves the line intact. I still don't know what you're doing. Unless your commas are something other than what you pasted. – tink Jul 31 '14 at 03:30
The cut in the OP modifies that example line. It doesn't drop much from the line but it does drop a little bit. – Etan Reisner Jul 31 '14 at 04:51

score 1 · Answer 3 · answered Jul 31 '14 at 06:26

This is a very interesting question. A pure BASH solution offers quite a bit of flexibility in how you process the entries once you have identified those that fall within the date/time range of interest. The simplest way in BASH is to get your start time and stop time in seconds since the epoch, test each log entry to determine whether it falls within that range, and then do something with the log entry. The basic logic involved is relatively short.

The width of the date_time field within the log can be set by passing the width as argument 4. Set the default dwidth as needed (currently 15, to match the syslog and journalctl format). The only required argument is the logfile name. If no start/stop time is specified, it will find all entries:

## set filename, set start time and stop time (in seconds since epoch) 
#  and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s`  ||  stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}

## read each line from the log file and act on only those with
#  date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read line || test -n "$line"; do

    test "${line:0:1}" != - || continue           # exclude journalctl first line
    logtm=`date --date "${line:0:$dwidth}" +%s`   # get logtime from entry in seconds since epoch

    if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
        echo "logtm: ${line:0:$dwidth} => $logtm"
    fi

done < "${lfname}"

working example:

#!/bin/bash

## log date format      len
#   journalctl          15
#   syslog              15
#   your log example    23

function usage {
    test -n "$1" && printf "\n Error: %s\n" "$1"
    printf "\n  usage  : %s logfile ['start datetime' 'stop datetime' tmfield_width]\n\n" "${0//*\//}"
    printf "  example: ./date-time-diff.sh syslog \"Jul 31 00:15:02\" \"Jul 31 00:18:30\"\n\n"
    exit 1
}

## test for required input & respond to help
test -n "$1" || usage "insufficient input."
test "$1" = "-h" || test "$1" = "--help" && usage

## set filename, set start time and stop time (in seconds since epoch) 
#  and time_field width (number of chars that make up date in log entry)
lfname=${1}
test -n "$2" && starttm=`date --date "$2" +%s` || starttm=0
test -n "$3" && stoptm=`date --date "$3" +%s`  ||  stoptm=${3:-`date --date "Jan 01 2037 00:01:00" +%s`}
dwidth=${4:-15}

## read each line from the log file and act on only those with
#  date_time between starttm and stoptm (inclusive)
while IFS=$'\n' read line || test -n "$line"; do

    test "${line:0:1}" != - || continue           # exclude journalctl first line
    logtm=`date --date "${line:0:$dwidth}" +%s`   # get logtime from entry in seconds since epoch

    if test $logtm -ge $starttm && test $logtm -le $stoptm ; then
        echo "logtm: ${line:0:$dwidth} => $logtm"
    fi

done < "${lfname}"

exit 0

usage:

$ ./date-time-diff.sh -h

  usage  : date-time-diff.sh logfile ['start datetime' 'stop datetime' tmfield_width]

  example: ./date-time-diff.sh syslog "Jul 31 00:15:02" "Jul 31 00:18:30"

Remember to quote your start and stop datetime strings. The test below was run against a logfile with 20 entries between Jul 31 00:12:58 and Jul 31 00:21:10.

test output:

$ ./date-time-diff.sh jc.log "Jul 31 00:15:02" "Jul 31 00:18:30"
logtm: Jul 31 00:15:02 => 1406783702
logtm: Jul 31 00:15:10 => 1406783710
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:15:11 => 1406783711
logtm: Jul 31 00:18:30 => 1406783910

Depending on what you need, another one of the solutions may fit your needs, but if you need to be able to process or manipulate the matching log entries, it is hard to beat a BASH script.
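
For logs whose lines begin with a fixed-width YYYY-MM-DD HH:MM:SS timestamp, as in the question, the same inclusive range test can be sketched without calling date once per line, because such timestamps sort the same way as strings and as times. This is a different technique from the script above and rests on that assumption about the log format; in_range is a hypothetical helper name and the sample lines are made up.

```shell
# in_range START STOP FILE (hypothetical helper):
# print lines whose leading 19-character timestamp
# ("YYYY-MM-DD HH:MM:SS") lies between START and STOP inclusive.
# The comparison is a plain string comparison, which matches time
# order only for this zero-padded, year-first format.
in_range() {
    awk -v start="$1" -v stop="$2" '
        { ts = substr($0, 1, 19) }
        ts >= start && ts <= stop
    ' "$3"
}

# Made-up sample data for illustration.
logfile=$(mktemp)
printf '%s\n' \
    '2014-07-30 17:59:59.999 ;; a' \
    '2014-07-30 18:00:00.000 ;; b' \
    '2014-07-30 19:17:34.542 ;; c' \
    '2014-07-30 20:00:00.000 ;; d' \
    '2014-07-30 20:00:01.000 ;; e' > "$logfile"

matched=$(in_range '2014-07-30 18:00:00' '2014-07-30 20:00:00' "$logfile")
printf '%s\n' "$matched"
rm -f "$logfile"
```

The milliseconds are ignored because only the first 19 characters are compared, so an entry at 20:00:00.999 would still count as inside a window that stops at 20:00:00.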

score 0 · Answer 4 · answered Jul 31 '14 at 02:07


You can pipe the results to grep again.

cat /dir/dir/dir/2014-07-30.txt | grep someword | cut -d',' -f1,4,3,7 \
    | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} (1[89]|20):'

answered Jul 31 '14 at 02:07

minopret


No such file or directory error is returned. – ZeroLoop Jul 31 '14 at 02:57
That's a wacky thing to say. The only file or directory is exactly as in your question. – tripleee Jul 31 '14 at 06:32

score -1 · Answer 5 · answered Jul 31 '14 at 02:24


I don't have enough reputation to comment, but as minopret suggested do one grep at a time.

Here is one of the solutions to get the 18-20 range:

grep ' 18:\| 19:\| 20:' filename.txt

answered Jul 31 '14 at 02:24

zeroRooter


I can't do one grep at a time as the log file contains info that needs to be together on the same line. Thanks. – ZeroLoop Jul 31 '14 at 02:58

score -1 · Accepted Answer · edited Sep 13 '16 at 19:44


I have found the answer in the form I was looking for:

cat /dir/dir/dir/2014-07-30.txt | grep *someword* | cut -d',' -f1,4,3,7 | egrep '[^ ]+ (2[0-2]):[0-9]'

The command above greps for the someword I need, uses cut to keep only the fields I need, and then uses egrep to restrict the output to the hours I need (20 to 22 in this example).
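
The hour alternation in the final egrep is what selects the time window. For the 18 to 20 window from the question it could be written as (1[89]|20). Below is a small check of that pattern, assuming lines that start with the date and time as in the question; the sample lines are made up, and grep -E is used as the equivalent of egrep.

```shell
# Made-up sample lines in the timestamp layout from the question.
sample='2014-07-30 17:59:59.000 ;; (p=0
2014-07-30 18:00:01.000 ;; (p=0
2014-07-30 19:17:34.542 ;; (p=0
2014-07-30 20:30:00.000 ;; (p=0
2014-07-30 21:00:00.000 ;; (p=0'

# Keep lines whose second field starts with hour 18, 19, or 20.
# "^[^ ]+ " skips the date, then the alternation matches the hour.
hits=$(printf '%s\n' "$sample" | grep -E '^[^ ]+ (1[89]|20):[0-9]')
printf '%s\n' "$hits"
```

Anchoring with ^ keeps the pattern from matching an hour-like number further along the line.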

edited Sep 13 '16 at 19:44

Kara


answered Aug 02 '14 at 00:43

ZeroLoop

