1

I am new to gawk. Based on this thread, I already made a gawk function to convert datetime such as "07,JUN,2015,06,PM" to unix time (millisecond):

$ cat tst.awk
function cvttime(t, a) {
    split(t,a,/[,: ]+/)

    # fa0,07,DEC,2014,10,AM,862.209018
    #  =>
    #    a[2] = "07"                date
    #    a[3] = "DEC"               month
    #    a[4] = "2014"              year
    #    a[5] = "06"                time
    #    a[6] = "AM"                AM/PM

    if ( (a[6] == "PM") && (a[5] < 12) ) {
        a[5] += 12
    }

    match("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC",a[3])
    a[3] = (RSTART+2)/3

    return( a[1]","mktime(a[4]" "a[3]" "a[2]" "a[5]" 00 0")"000,"a[7])
}

BEGIN {
    mdt = "fa0,07,DEC,2014,10,AM,862.209018"    
    ms = cvttime(mdt)
    print ms
}

In terminal, the following command gives me the correct unix time:

$ TZ=UTC gawk -f tst.awk

returns:

fa0,1417946400000,862.209018

Now, I have a file "input.csv" containing:

aa1,07,DEC,2014,06,AM,282.485988
ac3,07,DEC,2014,07,AM,97.6757181
ef3,07,DEC,2014,08,AM,112.816554
ag3,07,DEC,2014,09,AM,101.479961
fa0,07,DEC,2014,10,AM,862.209018

How should I modified the gawk function and the shell command to pass in "input.csv" and generate "output.csv" containing:

aa1,1417932000000,282.485988
ac3,1417935600000,97.6757181
ef3,1417939200000,112.816554
ag3,1417942800000,101.479961
fa0,1417946400000,862.209018

Thanks in advance!

Community
  • 1
  • 1
lys1030
  • 283
  • 1
  • 5
  • 17
  • While this might not be of great help, I can see you're going to run into some problems with having 07, DEC, 14, 10, and AM all separated by commas. You're going to have to take them all individually and then string them from there. Do something along the lines of reading "columns" 2-6, then arranging them in a string and using this command - `awk -F";" '{system("date -d \""$6"\" '+%s'")}' file` where you'd substitute $6 with the proper string. – Chirality Aug 01 '15 at 00:21
  • This ugly script I through together will do it ``cat input.csv | sed -r 's@([a-z0-9]*),([0-9]*),([A-Z]*),([0-9]*),([0-9]*),([A-Z]*),([0-9.]*)@echo "\1,`TZ=UTC date --date "\3 \2, \4 \5:00:00 \6" +%s`,\7"@e' > output.csv `` – Jeremiah Dicharry Aug 01 '15 at 01:12
  • Only GNU sed though. – Jeremiah Dicharry Aug 01 '15 at 01:13
  • Don't forget that 12 AM needs to be mapped to 0 — see [Convert 12-hour date/time to 24 hour date/time](http://stackoverflow.com/questions/440061) and a load of other questions. You may also find some useful ideas in [How to do date-time calculation and formatting in an `awk` script?](http://stackoverflow.com/questions/31602847) – Jonathan Leffler Aug 01 '15 at 01:32

2 Answers2

1

You could use something like this:

awk -F, '{"date -d "$3"\" \""$2"\" \""$5"\" \""$6"\" \""$4" '+%s'" | getline d; print $1","d"000,"$NF""}' input.csv > output.csv

or

awk -F, '{"date -u -d "$3"\" \""$2"\" \""$5"\" \""$6"\" \""$4" '+%s'" | getline d; print $1","d"000,"$NF""}' input.csv > output.csv

If you want set the flag -u from date command

-u, --utc, --universal print or set Coordinated Universal Time

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Davison
  • 71
  • 3
0

Well this is confusing because your input times do not match your output times, but I think this does what you want:

BEGIN {
  FS = OFS = ","
}
{
  # fix year
  $4 += 2000
  # fix month
  match("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", $3)
  $3 = (RSTART + 2) / 3
  # fix hour
  if ($6 == "PM" && $5 < 12)
    $5 += 12
  print $1, mktime($4 " " $3 " " $2 " " $5 " 0 0") * 1000, $NF
}
Zombo
  • 1
  • 62
  • 391
  • 407