0

I have an agent.log file that is updated at regular intervals.

Entries look like this: 2014-01-07 03:43:35,223 INFO ...some data

I want to extract the entries from the last 3 minutes. Is there a way to get this data using a bash script?

Wanna Coffee

5 Answers

9

Try the solution below:

awk \
-v start="$(date +"%F %R" --date="@$(( $(date +%s) - 180 ))")" \
-v end="$(date "+%F %R")" \
'$0 ~ start, $0 ~ end' \
agent.log

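The start variable holds the timestamp, truncated to the minute, from 3 minutes (180 seconds) before the current time.

The end variable holds the current time, also truncated to the minute.

The range pattern $0 ~ start, $0 ~ end prints every line from the first one that matches start through the next one that matches end.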
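
A range pattern only begins printing once some line matches the start minute. One alternative is to compare each line's timestamp against a cutoff as a plain string, which works because `YYYY-MM-DD HH:MM:SS` sorts chronologically. This is a minimal sketch rather than part of the answer above: `lines_since` is a hypothetical helper name, and it assumes every entry begins with a 19-character timestamp in that format.

```shell
# Hypothetical helper: print the lines of FILE whose leading timestamp
# ("YYYY-MM-DD HH:MM:SS", the first 19 characters) is at or after CUTOFF.
lines_since() {
    local cutoff="$1" file="$2"
    # substr() returns a string, so awk compares the two values lexicographically.
    awk -v cutoff="$cutoff" 'substr($0, 1, 19) >= cutoff' "$file"
}
```

With GNU date, the cutoff for the last 3 minutes could be supplied as `lines_since "$(date -d '3 minutes ago' '+%F %T')" agent.log`.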

Vlad.Bachurin
  • 1
    Wonderfully concise, but won't work if there happen to be no log entries made during the calendar minute 3+ minutes ago (which may be OK for the specific problem at hand); further suggestions: use `'$0 ~ "^"start` to anchor matching at the start of the line; `end` may not be needed, given that the reference point is now (assuming the log file is simply appended to in chronological order). OSX users: the `date` syntax differs; use `-v start="$(date -v'-3M' +'%F %R')"` – mklement0 Jan 14 '14 at 15:55
4

date +"%F %R" gives you the current time down to the minute.

grep '^'"$(date +"%F %R")" agent.log selects the entries from the current calendar minute.

The previous two minutes are trickier. I have written some scripts that handle relative and absolute time manipulation, which may be simpler than fiddling with date.

Two minutes ago, in the right format: date --date="@$(($(date +"%s") - 2*60))" +"%F %R"

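Combining all three (the alternatives are grouped so that the ^ anchor applies to each of them, not only to the first):

NOW=$(date +"%F %R")
M1=$(date --date="@$(($(date +"%s") - 1*60))" +"%F %R")
M2=$(date --date="@$(($(date +"%s") - 2*60))" +"%F %R")
grep -E "^($NOW|$M1|$M2)" agent.log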
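
The same idea can be generalized to any number of minutes by building the alternation in a loop. This is a rough sketch, assuming GNU date (for `-d @epoch`); `minute_regex` is a hypothetical helper name, not something from the answer.

```shell
# Hypothetical helper: build an anchored extended regex that matches the
# COUNT most recent calendar minutes, counting back from the epoch time NOW.
minute_regex() {
    local now="$1" count="$2" pattern="" stamp="" i=0
    while [ "$i" -lt "$count" ]; do
        # GNU date: format the instant i minutes before NOW as "YYYY-MM-DD HH:MM"
        stamp=$(date -d "@$((now - i * 60))" '+%F %R')
        # join the stamps with "|" (no separator before the first one)
        pattern="${pattern}${pattern:+|}${stamp}"
        i=$((i + 1))
    done
    printf '^(%s)\n' "$pattern"
}
```

It could then be used as `grep -E "$(minute_regex "$(date +%s)" 3)" agent.log`. Taking the epoch time once and deriving every minute from it avoids the calls to date straddling a minute boundary.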
mklement0
dargaud
  • +1 for also covering the case where no entries are present for one of the calendar minutes covered. OSX users: the date syntax differs; use date -v'-2M' +'%F %R' for '2 minutes ago'. – mklement0 Jan 14 '14 at 16:06
2

My answer takes the following into account:

  1. using bash and UNIX/Linux commands
  2. the reference time is the timestamp of the newest log entry, not the server's current time
  3. no assumptions are made about how far apart the entries' dates are (minutes, days, years, etc.)
  4. the script should be easy to extend to the inverse case or to a specified from-to interval

    #!/bin/bash
    # This script expects the log in descending date order (newest entry first),
    # the reverse of most real-world logs.
    FILE=$1
    INTV=180 # seconds

    while IFS= read -r LINE
    do
        # epoch seconds of this line's timestamp (everything before "INFO")
        ACT_LOG_LINE=$(date --date="$(echo "$LINE" | sed -e 's/INFO.*//')" +%s)
        if [ -z "$LAST_LOG_LINE" ]
        then
            # the first line read marks the start of the interval
            LAST_LOG_LINE=$ACT_LOG_LINE
        fi
        # stop reading once a line is more than $INTV (180s) older than the start
        if [ $((LAST_LOG_LINE - ACT_LOG_LINE)) -gt "$INTV" ]
        then
            break
        fi
        # otherwise print the line
        echo "$LINE"
    done < "$FILE"
    

    Testing:

    2014-01-07 03:43:35,223 INFO ...some data
    2014-01-07 03:42:35,223 INFO ...some data
    2014-01-07 03:41:35,223 INFO ...some data
    2014-01-07 03:40:35,223 INFO ...some data
    2014-01-07 02:43:35,223 INFO ...some data
    2014-01-07 01:43:35,223 INFO ...some data
    2014-01-06 03:43:35,223 INFO ...some data
    

    $ /tmp/stack.sh /tmp/log 
    2014-01-07 03:42:35,223 INFO ...some data
    2014-01-07 03:41:35,223 INFO ...some data
    2014-01-07 03:40:35,223 INFO ...some data
    $
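
For a log in the usual ascending order, the same "newest entry is the reference time" idea can be sketched by reading the last line first, along the lines suggested in the comments below (`tail -1`, `cut -d ','`). This is only an illustration under stated assumptions: GNU date, entries that start with `YYYY-MM-DD HH:MM:SS,mmm`, and a hypothetical helper name `tail_window`.

```shell
# Hypothetical helper: print the entries of FILE that lie within INTERVAL
# seconds (default 180) of the newest entry, taken to be the last line.
tail_window() {
    local file="$1" interval="${2:-180}" last ref cutoff
    # timestamp of the newest entry, without the milliseconds
    last=$(tail -n 1 "$file" | cut -d ',' -f 1)
    # GNU date: convert to epoch seconds; give up if the line has no valid date
    ref=$(date -d "$last" +%s) || return 1
    # convert the start of the window back to the log's timestamp format
    cutoff=$(date -d "@$((ref - interval))" '+%F %T')
    # keep lines whose first 19 characters sort at or after the cutoff
    awk -v cutoff="$cutoff" 'substr($0, 1, 19) >= cutoff' "$file"
}
```

Called as `tail_window agent.log 180`, it keeps every entry no more than 180 seconds older than the last one, regardless of the server's current time.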
csikos.balint
  • Looks like you're assuming that the log file contains the most recent entries *first*, but that's not how log files usually work. Also, it seems that you never output the first line (which is the most recent entry by your logic). – mklement0 Jan 14 '14 at 16:11
  • Hi, you are right about the log file order. LAST_LOG_LINE has to store the last line of the log (maybe via `tail -1` outside of the `while` loop). But the script prints the first line too. – csikos.balint Jan 14 '14 at 17:39
  • No - you'd have to remove `continue` to also print the first line (see your own example output); that said, I encourage you to revise your answer more thoroughly: (a) revise the date-extraction code to only grab the date itself, e.g.: `cut -d ',' -f 1 <<<"$LINE"` (b) grab the *last* line first, before the loop, with `tail -1`, as you suggest, (c) consistently double-quote variable references, e.g. `"$LINE"`. – mklement0 Jan 14 '14 at 21:54
  • Oh, yes, you are right. I will edit this answer... – csikos.balint Jan 15 '14 at 18:07
0

I think you may be somewhat better off using Python in this case. Even if no log entry is dated exactly 3 minutes ago, this script still prints every entry made between 3 minutes ago and the time the script was called. This is both concise and more robust than some of the previous solutions offered.

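#!/usr/bin/env python
import sys
from datetime import datetime, timedelta

# Compute the cutoff once so every line is compared against the same instant.
cutoff = datetime.now() - timedelta(minutes=3)

with open('agent.log') as f:
    for line in f:
        # The timestamp is everything before the first comma (milliseconds dropped).
        logdate = datetime.strptime(line.split(',')[0], '%Y-%m-%d %H:%M:%S')
        if logdate >= cutoff:
            sys.stdout.write(line)  # the line already ends with a newline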
benjwadams
0

A Ruby solution (tested on ruby 1.9.3)

You can pass days, hours, minutes or seconds as a parameter, and the script will search for the expression in the specified file (or directory, in which case it appends '/*' to the name).

In your case, call the script like so: $0 -m 3 "expression" log_file

Note: If you know the location of 'ruby', change the shebang (the first line of the script) to point to it, for security reasons.

#! /usr/bin/env ruby

require 'date'
require 'pathname'

if ARGV.length != 4
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end
begin
        total_amount = Integer ARGV[1]
rescue ArgumentError
        $stderr.print "error: parameter 'time' must be an Integer\n"
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        # stop here; otherwise total_amount stays nil and the loop below fails
        exit 1
end

if ARGV[0] == "-m"
        gap = Rational(60, 86400)
        time_str = "%Y-%m-%d %H:%M"
elsif ARGV[0] == "-s"
        gap = Rational(1, 86400)
        time_str = "%Y-%m-%d %H:%M:%S"
elsif ARGV[0] == "-h"
        gap = Rational(3600, 86400)
        time_str = "%Y-%m-%d %H"
elsif ARGV[0] == "-d"
        time_str = "%Y-%m-%d"
        gap = 1
else
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        exit 1
end

pn = Pathname.new(ARGV[3])
if pn.exist?
        log = (pn.directory?) ? ARGV[3] + "/*" : ARGV[3]
else
        $stderr.print "error: file '" << ARGV[3] << "' does not exist\n"
        $stderr.print "usage: #{$0} -d|-h|-m|-s time expression log_file\n"
        # stop here; otherwise log stays nil and the search below fails
        exit 1
end

search_str = ARGV[2]
now = DateTime.now

total_amount.times do
        now -= gap
        system "cat " << log << " | grep '" << now.strftime(time_str) << ".*" << search_str << "'"
end
simi