Replacing a pandas dataframe column

Question

I have some code which is pulling data into a pandas dataframe.

df = df[["date","high","low","open","close"]]

The data looks like this:

       date         high          low        open       close
0  1499140800  2615.390000  2600.000000  2610.58000  2603.00000
1  1499155200  2618.785462  2582.002100  2615.00000  2607.00000
2  1499169600  2617.674406  2550.314214  2607.00000  2554.55010
3  1499184000  2572.962113  2519.967440  2554.55010  2519.96744
4  1499198400  2595.654622  2516.000000  2519.96744  2580.00000

I am wanting to convert the unix timestamp column with an actual datetime. Earlier in my code am having to import

from datetime import datetime

and it seems to be giving me an error when I try to use datetime.datetime to convert. I can't just use import datetimebecause this gives me an error on my previous import which I need. So my question is twofold. How can I convert the timestamp into a datetime (possibly without using the datetime module?) and how can I overwrite the date column with the new date time before I save into a file?I have tried several approaches but none seem to be giving me any luck..

That's a great little solution and I am all for using one liners. Really nice thank you. However this seems to give me the output "1970-01-01 00:00:01.499155200"? which still has the stamp somehow attached? Also you don't answer how I can now replace my date column and inject it into the dataframe before I save it down! — Jo3Dirt, Jul 06 '17 at 04:00
I recommend you take another look at @MaxU's answer. Its vectorized and at least deserves your up-vote. — piRSquared, Jul 06 '17 at 05:35

ctj232 · Accepted Answer · 2017-07-06T04:15:32.517

This code worked for me after I copied your example data to a file in.csv

An explanation follows the code example.

>>> import pandas as pd

>>> df = pd.read_csv('in.csv')

>>> df
     date         high          low        open       close
0  1499140800  2615.390000  2600.000000  2610.58000  2603.00000
1  1499155200  2618.785462  2582.002100  2615.00000  2607.00000
2  1499169600  2617.674406  2550.314214  2607.00000  2554.55010
3  1499184000  2572.962113  2519.967440  2554.55010  2519.96744
4  1499198400  2595.654622  2516.000000  2519.96744  2580.00000

>>> from datetime import datetime

>>> converted = df['date'].apply(datetime.fromtimestamp)

>>> df['date'] = converted

>>> df
              date         high          low        open       close
0 2017-07-03 22:00:00  2615.390000  2600.000000  2610.58000  2603.00000
1 2017-07-04 02:00:00  2618.785462  2582.002100  2615.00000  2607.00000
2 2017-07-04 06:00:00  2617.674406  2550.314214  2607.00000  2554.55010
3 2017-07-04 10:00:00  2572.962113  2519.967440  2554.55010  2519.96744
4 2017-07-04 14:00:00  2595.654622  2516.000000  2519.96744  2580.00000

Converting a Unix Timestamp to Datetime

You ran

from datetime import datetime

and expected to use datetime.datetime later as a function, but you actually imported the datetime class from the datetime module (which is why changing the import to just datetime didn't help). See this question for more: Converting unix timestamp string to readable date in Python

So you just need to import the datetime class and use its fromtimestamp function

Updating a Column in a DataFrame

For this, a common approach is to apply a function (in this case, fromtimestamp) to the column (a Pandas Series) and then over-write the old column in the DataFrame.

I am not aware of a way to modify the column in place, but would be happy to hear a way if I missed it in the Pandas docs.

Two lines..absolutely brilliant! Thank you so much! Gotta love python and the great community it has! — Jo3Dirt, Jul 06 '17 at 04:14

score 1 · Answer 2 · answered Jul 06 '17 at 04:53

Try this:

In [2]: df
Out[2]:
         date         high          low        open       close
0  1499140800  2615.390000  2600.000000  2610.58000  2603.00000
1  1499155200  2618.785462  2582.002100  2615.00000  2607.00000
2  1499169600  2617.674406  2550.314214  2607.00000  2554.55010
3  1499184000  2572.962113  2519.967440  2554.55010  2519.96744
4  1499198400  2595.654622  2516.000000  2519.96744  2580.00000

In [3]: df['date'] = pd.to_datetime(df['date'], unit='s')

In [4]: df
Out[4]:
                 date         high          low        open       close
0 2017-07-04 04:00:00  2615.390000  2600.000000  2610.58000  2603.00000
1 2017-07-04 08:00:00  2618.785462  2582.002100  2615.00000  2607.00000
2 2017-07-04 12:00:00  2617.674406  2550.314214  2607.00000  2554.55010
3 2017-07-04 16:00:00  2572.962113  2519.967440  2554.55010  2519.96744
4 2017-07-04 20:00:00  2595.654622  2516.000000  2519.96744  2580.00000

score 0 · Answer 3 · answered Jul 06 '17 at 04:23

0

As @Vinicius Aguiar said in the comment, you can also use pd.to_datetime().

df['date'] = pd.to_datetime(df['date']).dt.strftime('%Y-%m-%d %H:%M:%S')

answered Jul 06 '17 at 04:23

Rod Xavier

3,983
1
29
41

Replacing a pandas dataframe column

3 Answers3

Converting a Unix Timestamp to Datetime

Updating a Column in a DataFrame