I have a huge log file in this format:

202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,01/Jun/1995:00:02:51,/~ladd/ostriches.html,200,205908
...

I need to calculate the difference in seconds between the timestamp on the first line and the timestamp on each following line. The second column is in a format like this:

dd/month/year:HH:MM:SS

I can change it in vim using the command:

:%s/\/Jun\//\:Jun\:/g

Then I get:

fromkin.lib.uwm.edu,01:Jun:1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,01:Jun:1995:11:58:03,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,01:Jun:1995:11:58:06,/~macphed/finite/fe_resources/node92.html,200,1668

in format:

dd:month:year:HH:MM:SS

Is there any way to do this in a shell script or awk?

My expected output is:

fromkin.lib.uwm.edu,01:Jun:1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,0,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,3,/~macphed/finite/fe_resources/node92.html,200,1668
Marcin Erbel
  • @fedorqui although it is finding the difference between two lines, the question you linked does not include comparing the first and current lines. –  Nov 03 '14 at 12:37
  • and also doesn't include date and time in one column with different delimiters... – Marcin Erbel Nov 03 '14 at 12:41
  • I think [How to calculate time difference in bash script?](http://stackoverflow.com/q/8903239/) kind of solves this question. But no problem, reopening it. – fedorqui Nov 03 '14 at 13:11

1 Answer

It's not clear what your expected output should be, since the sample output you posted does not match the sample input you posted. To print the number of seconds between the timestamp on the first line and the timestamp on each subsequent line of the posted input, you could use this (GNU awk is required for the time functions):

$ cat file
202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,01/Jun/1995:00:02:51,/~ladd/ostriches.html,200,205908
fromkin.lib.uwm.edu,01/Jun/1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,01/Jun/1995:11:58:03,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,01/Jun/1995:11:58:06,/~macphed/finite/fe_resources/node92.html,200,1668

.

$ cat tst.awk
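BEGIN{ FS=OFS="," }
{
    # split dd/Mon/yyyy:HH:MM:SS into t[1]..t[6]
    split($2,t,/[\/:]/)
    # month name to number: "Jun" starts at position 16, and (16+2)/3 = 6
    mthNr = (match("JanFebMarAprMayJunJulAugSepOctNovDec",t[2])+2)/3
    # mktime() is a GNU awk function: "YYYY MM DD HH MM SS" -> epoch seconds (local time)
    currSecs = mktime(t[3]" "mthNr" "t[1]" "t[4]" "t[5]" "t[6])

    if (NR == 1) {
        baseSecs = currSecs    # first line: remember its time, print the line unchanged
    }
    else {
        $2 = currSecs - baseSecs    # later lines: seconds elapsed since the first line
    }
    print
}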

.

$ awk -f tst.awk file
202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,112,/~ladd/ostriches.html,200,205908
fromkin.lib.uwm.edu,43024,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,43024,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,43027,/~macphed/finite/fe_resources/node92.html,200,1668
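
If GNU awk is not available, the same idea can be sketched in POSIX awk by computing a day number by hand instead of calling mktime(). This is only a rough sketch under some assumptions: the function name secs_since_first is made up for illustration, the day-number arithmetic is the standard Julian Day Number formula for Gregorian dates, and it treats every day as exactly 86400 seconds, so it ignores time zones and daylight saving changes (mktime() works in local time, so the two approaches could differ across a DST transition).

```shell
# secs_since_first: hypothetical helper, reads the comma-separated log on stdin
# and replaces field 2 on every line after the first with the number of
# seconds elapsed since the timestamp on the first line.
secs_since_first() {
    awk '
    BEGIN { FS = OFS = "," }

    # Julian Day Number of a Gregorian date; only differences between
    # two results are used, so the absolute value does not matter here.
    function daynum(y, m, d,    a, yy, mm) {
        a  = int((14 - m) / 12)
        yy = y + 4800 - a
        mm = m + 12 * a - 3
        return d + int((153 * mm + 2) / 5) + 365 * yy \
                 + int(yy / 4) - int(yy / 100) + int(yy / 400) - 32045
    }

    {
        # dd/Mon/yyyy:HH:MM:SS -> t[1]..t[6]
        split($2, t, /[\/:]/)
        # index() finds the month name; "Jun" is at position 16 -> (16+2)/3 = 6
        m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", t[2]) + 2) / 3
        secs = daynum(t[3], m, t[1]) * 86400 + t[4] * 3600 + t[5] * 60 + t[6]

        if (NR == 1) base = secs        # first line is printed unchanged
        else         $2 = secs - base   # later lines get the elapsed seconds
        print
    }'
}
```

It would be called as `secs_since_first < file`. Since all timestamps in the sample file fall on the same day, it should produce the same numbers as the gawk script above for that input.
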
Ed Morton
  • I would like to have output like: fromkin.lib.uwm.edu,3,/~scottp/publish.html,200,271 So I'm expecting the time in seconds. – Marcin Erbel Nov 04 '14 at 09:31
  • I updated my answer to take another guess at what you want. Getting the time in seconds is trivial; it's just not clear where all the other information in your posted expected output is coming from (the first line, the current line, the previous line, some other file, somewhere else), or whether the time diff should be between successive lines, between the first line and the current line, or something else. It's important when posting questions to show specific sample input and the specific output you expect from THAT input. – Ed Morton Nov 04 '14 at 13:59
  • I definitely should. I fixed it and put in the specific output I expect. But the answer is mostly correct. Only the first row should be left unchanged. – Marcin Erbel Nov 04 '14 at 14:23
  • I modified the script to do that. Just moved the "print" out of the "else" block. – Ed Morton Nov 04 '14 at 14:42