I have a Very large (13 GiB) csv file (3856321 rows and 1698) where as expected some of the dates are differently formated. The file looks like ::
2013/01/08 2:11:30 AM,abdc,good time ...
2015/12/28 8:19:30 PM,abdc,good time ...
2/15/2016 10:46:30 AM,kdafh,almost as good ...
12/13/2014 10:46:00 PM,asjhdk,not that good ...
02-Jan-2014,bad time,good time ...
1/1/2015,nomiss time,boy ...
10/15/2016 17:08:30,bad,boy ...
I want to convert it to a same time format and required output is ::
1/8/2013 2:11:30,abdc,good time
12/28/2015 20:19:30,abdc,good time
2/15/2016 10:46:30,kdafh,almost as good
12/13/2014 22:46:00,asjhdk,not that good
1/2/2014 00:00:00,bad time,good time
1/1/2015 00:00:00,nomiss time,boy
10/15/2016 17:08:30,bad,boy
I managed to format the time using the following scripts
awk -F ',' 'BEGIN{FS=OFS=","}{split($1,a," ");
if(a[3]=="PM")
{ split(a[2],b,":");
b[1]=b[1]+12
a[2]=b[1]":"b[2]":"b[3]
};
if(a[2]=="")
{
a[2]="00:00:00"
}
tmp=a[1];
# tmp2=system("date -d `tmp` +%m/%d/%Y");
# print tmp2
$1=tmp" "a[2]
}1' time_input.csv
I borrowed the idea of formatting dates from question https://unix.stackexchange.com/questions/177888/how-to-convert-date-format-in-file which is commented out in the second last line. However, this does not work in my case. I get an error
date: invalid date ‘+%m/%d/%Y’
Is there an easier and better way to do this? Thanks in advance