2

I want to convert two timestamp columns start_date and end_date to normal date columns:

   id     start_date       end_date
0   1  1578448800000  1583632800000
1   2  1582164000000  1582250400000
2   3  1582509600000  1582596000000
3   4  1583373600000  1588557600000
4   5  1582509600000  1582596000000
5   6  1582164000000  1582250400000
6   7  1581040800000  1586224800000
7   8  1582423200000  1582509600000
8   9  1583287200000  1583373600000

The following code works for one timestamp, but how could I apply it to those two columns? Thanks for your kind helps.

import datetime
timestamp = datetime.datetime.fromtimestamp(1500000000)
print(timestamp.strftime('%Y-%m-%d %H:%M:%S'))

Output:

2017-07-14 10:40:00

I also try with pd.to_datetime(df['start_date']/1000).apply(lambda x: x.date()) which give a incorrect result.

0    1970-01-01
1    1970-01-01
2    1970-01-01
3    1970-01-01
4    1970-01-01
5    1970-01-01
6    1970-01-01
7    1970-01-01
8    1970-01-01
ah bon
  • 9,293
  • 12
  • 65
  • 148

1 Answers1

3

Use DataFrame.apply with list of columns names and to_datetime with parameter unit='ms':

cols = ['start_date', 'end_date']
df[cols] = df[cols].apply(pd.to_datetime, unit='ms')
print (df)
   id          start_date            end_date
0   1 2020-01-08 02:00:00 2020-03-08 02:00:00
1   2 2020-02-20 02:00:00 2020-02-21 02:00:00
2   3 2020-02-24 02:00:00 2020-02-25 02:00:00
3   4 2020-03-05 02:00:00 2020-05-04 02:00:00
4   5 2020-02-24 02:00:00 2020-02-25 02:00:00
5   6 2020-02-20 02:00:00 2020-02-21 02:00:00
6   7 2020-02-07 02:00:00 2020-04-07 02:00:00
7   8 2020-02-23 02:00:00 2020-02-24 02:00:00
8   9 2020-03-04 02:00:00 2020-03-05 02:00:00

EDIT: For dates add lambda function with Series.dt.date:

cols = ['start_date', 'end_date']
df[cols] = df[cols].apply(lambda x: pd.to_datetime(x, unit='ms').dt.date)
print (df)
   id  start_date    end_date
0   1  2020-01-08  2020-03-08
1   2  2020-02-20  2020-02-21
2   3  2020-02-24  2020-02-25
3   4  2020-03-05  2020-05-04
4   5  2020-02-24  2020-02-25
5   6  2020-02-20  2020-02-21
6   7  2020-02-07  2020-04-07
7   8  2020-02-23  2020-02-24
8   9  2020-03-04  2020-03-05

Or convert each column separately:

df['start_date'] = pd.to_datetime(df['start_date'], unit='ms')
df['end_date'] = pd.to_datetime(df['end_date'], unit='ms')
print (df)
   id          start_date            end_date
0   1 2020-01-08 02:00:00 2020-03-08 02:00:00
1   2 2020-02-20 02:00:00 2020-02-21 02:00:00
2   3 2020-02-24 02:00:00 2020-02-25 02:00:00
3   4 2020-03-05 02:00:00 2020-05-04 02:00:00
4   5 2020-02-24 02:00:00 2020-02-25 02:00:00
5   6 2020-02-20 02:00:00 2020-02-21 02:00:00
6   7 2020-02-07 02:00:00 2020-04-07 02:00:00
7   8 2020-02-23 02:00:00 2020-02-24 02:00:00
8   9 2020-03-04 02:00:00 2020-03-05 02:00:00

And for dates:

df['start_date'] = pd.to_datetime(df['start_date'], unit='ms').dt.date
df['end_date'] = pd.to_datetime(df['end_date'], unit='ms').dt.date
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252