How to change the date format of the whole column?

Question

I am analyzing a .csv file whose first column holds datetimes in the format "2016-09-15T00:00:13", and I want to convert them to standard Python datetime objects. I can convert a single date, but I cannot do it for the whole column.

The code I am using:

import numpy
import dateutil.parser
mydate = dateutil.parser.parse(numpy.mydata[1:,0])
print(mydate)

I am getting the error:

'module' object has no attribute 'mydata'

Here is the column for which I want the format to be changed.

print(mydata[1:,0])

['2016-09-15T00:00:13' '2016-09-15T00:00:38' '2016-09-15T00:00:53' ...,
 '2016-09-15T23:59:28' '2016-09-15T23:59:37' '2016-09-15T23:59:52']

Did you try using `pandas` data-frames? – Sreejith Menon Nov 02 '16 at 05:16
Look into making the column `np.datetime64`. – hpaulj Nov 02 '16 at 05:59

staples · Accepted Answer · 2016-11-02T05:39:27.380

from datetime import datetime

# Parse every string in the column (mydata[1:,0] skips the header row).
dates = [datetime.strptime(d, '%Y-%m-%dT%H:%M:%S') for d in mydata[1:,0]]

The method I'm using is datetime.strptime; the Python datetime documentation also lists the format arguments.

Oh and about the

'module' object has no attribute 'mydata'

You call numpy.mydata, which looks up an attribute named "mydata" on the numpy module you imported. The problem is that "mydata" is just one of your own variables, not something included with numpy, so refer to it directly as mydata[1:,0].
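To make both points concrete, here is a minimal sketch. The small 2-D array is a made-up stand-in for the asker's mydata (a header row followed by data rows), and the names column and dates are only for illustration.

```python
from datetime import datetime

import numpy as np

# Hypothetical stand-in for the asker's array: a header row, then data rows.
mydata = np.array([
    ['time', 'value'],
    ['2016-09-15T00:00:13', '1'],
    ['2016-09-15T00:00:38', '2'],
    ['2016-09-15T23:59:52', '3'],
])

# mydata is a local variable, so index it directly (not numpy.mydata).
column = mydata[1:, 0]

# strptime handles one string at a time, so apply it to each element.
dates = [datetime.strptime(s, '%Y-%m-%dT%H:%M:%S') for s in column]
print(dates[0])
```

The same per-element pattern should work with dateutil.parser.parse in place of strptime, since it also takes a single string rather than an array.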

score 1 · Answer 2 · answered Nov 02 '16 at 08:04

Unless you have a compelling reason to avoid it, pandas is the way to go with this kind of analysis. You can simply do

import pandas
df = pandas.read_csv('myfile.csv', index_col=0, parse_dates=True)

With index_col=0 the first column becomes the index, and parse_dates=True parses the dates in it. This is probably what you want.
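A minimal sketch of the pandas route, using an in-memory sample in place of myfile.csv. The column names time and value are invented for illustration. It names the date column explicitly in parse_dates, so the result does not depend on which column ends up as the index.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file.
sample_csv = io.StringIO(
    "time,value\n"
    "2016-09-15T00:00:13,1\n"
    "2016-09-15T00:00:38,2\n"
    "2016-09-15T23:59:52,3\n"
)

# Ask pandas to parse the named column as datetimes while reading.
df = pd.read_csv(sample_csv, parse_dates=["time"])
print(df.dtypes)

# Individual values are pandas Timestamps; to_pydatetime() gives a
# standard library datetime if one is needed.
first = df["time"].iloc[0].to_pydatetime()
print(first)
```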

hpaulj · Answer 3 · 2016-11-02T08:08:46.403

Assuming you've dealt with that numpy.mydata[1:,0] attribute error, your data looks like:

In [268]: mydata=['2016-09-15T00:00:13' ,
     ...: '2016-09-15T00:00:38' ,
     ...: '2016-09-15T00:00:53' ,
     ...: '2016-09-15T23:59:28' ,
     ...: '2016-09-15T23:59:37' ,
     ...: '2016-09-15T23:59:52']

or, in array form, it is a 1d array of strings:

In [269]: mydata=np.array(mydata)
In [270]: mydata
Out[270]: 
array(['2016-09-15T00:00:13', '2016-09-15T00:00:38', '2016-09-15T00:00:53',
       '2016-09-15T23:59:28', '2016-09-15T23:59:37', '2016-09-15T23:59:52'], 
      dtype='<U19')

numpy has its own version of datetime, datetime64, which stores each value as a 64-bit integer and can be used numerically. Your dates readily convert to that with astype (your format is standard ISO 8601):

In [271]: mydata.astype(np.datetime64)
Out[271]: 
array(['2016-09-15T00:00:13', '2016-09-15T00:00:38', '2016-09-15T00:00:53',
       '2016-09-15T23:59:28', '2016-09-15T23:59:37', '2016-09-15T23:59:52'], 
       dtype='datetime64[s]')

With that result assigned to D, tolist converts the array to a list, and the dates to datetime objects:

In [274]: D.tolist()
Out[274]: 
[datetime.datetime(2016, 9, 15, 0, 0, 13),
 datetime.datetime(2016, 9, 15, 0, 0, 38),
 datetime.datetime(2016, 9, 15, 0, 0, 53),
 datetime.datetime(2016, 9, 15, 23, 59, 28),
 datetime.datetime(2016, 9, 15, 23, 59, 37),
 datetime.datetime(2016, 9, 15, 23, 59, 52)]

which could be turned back into an array of dtype object:

In [275]: np.array(D.tolist())
Out[275]: 
array([datetime.datetime(2016, 9, 15, 0, 0, 13),
       datetime.datetime(2016, 9, 15, 0, 0, 38),
       datetime.datetime(2016, 9, 15, 0, 0, 53),
       datetime.datetime(2016, 9, 15, 23, 59, 28),
       datetime.datetime(2016, 9, 15, 23, 59, 37),
       datetime.datetime(2016, 9, 15, 23, 59, 52)], dtype=object)

An object array like this gets none of the fast numeric operations that datetime64 supports, so the list would be just as useful.

If your string format wasn't standard you'd have to use the datetime parser in a list comprehension as @staples shows.
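As a small illustration of what "used numerically" can mean for datetime64, here is a sketch with three of the timestamps above. The names gaps and gap_seconds are made up for the example.

```python
import numpy as np

mydata = np.array(['2016-09-15T00:00:13',
                   '2016-09-15T00:00:38',
                   '2016-09-15T00:00:53'])

# Convert the ISO 8601 strings to datetime64 with one-second resolution.
D = mydata.astype('datetime64[s]')

# Differences between consecutive timestamps come out as timedelta64 values.
gaps = np.diff(D)

# Dividing by a one-second timedelta gives plain floats (seconds).
gap_seconds = gaps / np.timedelta64(1, 's')
print(gap_seconds)

# tolist() still yields standard datetime objects when they are needed.
as_datetimes = D.tolist()
print(as_datetimes[0])
```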
