6

Looking to convert human readable timestamps to epoch/Unix time within a CSV file using GAWK in preparation for loading into a MySQL DB.

Data Example:

{null};2013-11-26;Text & Device;Location;/file/path/to/;Tuesday, November 26 12:17 PM;1;1385845647

Looking to take column 6, Tuesday, November 26 12:17 PM, and convert to epoch time for storage. All times shown will be in EST format. I realize AWK is the tool for this, but can't quite seem to structure the command. Currently have:

cat FILE_IN.CSV | awk 'BEGIN {FS=OFS=";"}{$6=strftime("%s")} {print}' 

However this returns:

{null};2013-11-26;Text & Device;Location;/file/path/to/;1385848848;1;1385845647

Presumably, this means I'm calling the current epoch time (1385848848 was current epoch at time of execution) and not asking strftime to convert the string; but I can't imagine another way to doing this.

What is the proper syntax for gawk/strftime to convert an existing timestamp to epoch?

Edit: This question seems loosely related to How do I use output from awk in another command?

Community
  • 1
  • 1
Keiron
  • 117
  • 1
  • 8
  • `strftime` is for formatting a time that's already in epoch timestamp form; it doesn't parse other formats. For that you would use `strptime`, but that's not available from Gawk. Might be time to reach for Perl. – Mark Reed Nov 30 '13 at 22:17
  • Unfortunately stuck on a corporate system that limits me to BASH only. – Keiron Nov 30 '13 at 22:21
  • Confused. If you can call `gawk`, you can call `perl`... – Mark Reed Nov 30 '13 at 22:22
  • I can call perl, but sysadmin requires all scripting be done only in bash. I realize its silly, but not in a position argue. – Keiron Nov 30 '13 at 22:26
  • `perl -ae 'code goes here'` is just as much a bash script as `awk 'code goes here'`. – Mark Reed Nov 30 '13 at 22:55
  • I agree with [almost] every point (except the crazy idea that perl might be useful) :-). gawk is the right solution BUT gawk is not bash. awk is a standard UNIX tool (unlike perl) but gawk is not awk either in as much as gawk doesn't come with every UNIX installation, just some version of awk does. – Ed Morton Dec 01 '13 at 02:58

2 Answers2

6
$ cat file
{null};2013-11-26;Text & Device;Location;/file/path/to/;Tuesday, November 26 12:17 PM;1;1385845647

$ gawk 'BEGIN{FS=OFS=";"} {gsub(/-/," ",$2); $2=mktime($2" 0 0 0")}1' file
{null};1385445600;Text & Device;Location;/file/path/to/;Tuesday, November 26 12:17 PM;1;1385845647

Here's how to generally convert a date from any format to seconds since the epoch using your current format as an example and with comments to show the conversion process step by step:

$ cat tst.awk
function cvttime(t,     a) {
    split(t,a,/[,: ]+/)
    # 2013 Tuesday, November 26 10:17 PM
    #  =>
    #    a[1] = "2013"
    #    a[2] = "Tuesday"
    #    a[3] = "November"
    #    a[4] = "26"
    #    a[5] = "10"
    #    a[6] = "17"
    #    a[7] = "PM"

    if ( (a[7] == "PM") && (a[5] < 12) ) {
        a[5] += 12
    }
    # => a[5] = "22"

    a[3] = substr(a[3],1,3)
    # => a[3] = "Nov"

    match("JanFebMarAprMayJunJulAugSepOctNovDec",a[3])
    a[3] = (RSTART+2)/3
    # => a[3] = 11

    return( mktime(a[1]" "a[3]" "a[4]" "a[5]" "a[6]" 0") )
}

BEGIN {
    mdt ="Tuesday, November 26 10:17 PM"
    secs = cvttime(2013" "mdt)
    dt = strftime("%Y-%m-%d %H:%M:%S",secs)
    print mdt ORS "\t-> " secs ORS "\t\t-> " dt
}
$ awk -f tst.awk
Tuesday, November 26 10:17 PM
        -> 1385525820
                -> 2013-11-26 22:17:00

I'm sure you can modify that for the current problem.

Also, if you don't have gawk you can write the cvttime() function as (borrowing @sputnik's date command string):

$ cat tst2.awk
function cvttime(t,     cmd,secs) {
    cmd = "date -d \"" t "\" '+%s'"
    cmd | getline secs
    close(cmd)
    return secs
}

BEGIN {
    mdt ="Tuesday, November 26 10:17 PM"
    secs = cvttime(mdt)
    dt = strftime("%Y-%m-%d %H:%M:%S",secs)
    print mdt ORS "\t-> " secs ORS "\t\t-> " dt
}
$
$ awk -f tst2.awk
Tuesday, November 26 10:17 PM
        -> 1385525820
                -> 2013-11-26 22:17:00

I left srtftime() in there just to show that the secs was correct - replace with date as you see fit.

For the non-gawk version, you just need to figure out how to get the year into the input month/date/time string in a way that date understands if that maters to you - shouldn't be hard.

Ed Morton
  • 188,023
  • 17
  • 78
  • 185
  • This just about answers the question, but was looking to convert column $6 instead of $2. Difference being the format; "Tuesday, November 26 12:17 PM". Very helpful for getting me there, been playing around with options. – Keiron Dec 01 '13 at 05:08
  • Ah, I didn't notice that. Since that field doesn't have a year - can we use the date from $2 with the time from $6 or how else should we determine the year? – Ed Morton Dec 01 '13 at 13:26
  • I've updated my answer to show in general how to convert from any date format to seconds since the epoch - you can just tweak to provide the year however you see fit. – Ed Morton Dec 01 '13 at 14:09
  • Yes, fair to assume year in $2 will be the same and thank you very much for the help! – Keiron Dec 01 '13 at 14:25
  • You're welcome, thanks for accepting the answer, and I just added a non-gawk version of the cvttime() function in case you find that your sys admin doesn't consider gawk to be acceptable after all. – Ed Morton Dec 01 '13 at 14:30
2

You can convert date to epoch with this snippet :

$ date -d 'Tuesday, November 26 12:17 PM' +%s
1385464620

So finally :

awk -F";" '{system("date -d \""$6"\" '+%s'")}' file

Thanks @Keiron for the snippet.

Gilles Quénot
  • 173,512
  • 41
  • 224
  • 223
  • Was trying precisely this earlier, but couldn't think of a way to call the command for each line of the CSV. Is there a way to embed something like: date -d $6 '+%s' – Keiron Nov 30 '13 at 22:20
  • This was the code I was using in that approach: `cat FILE_IN.csv | awk -F";" '{system("date -d "$6" '+%s'")}'` – Keiron Nov 30 '13 at 23:21
  • Found the error in the syntax `awk -F";" '{system("date -d \""$6"\" '+%s'")}'` – Keiron Dec 01 '13 at 04:28