5

I'm importing some data in a spreadsheet. It's in a dataframe, but the date is in a numerical representation or format

41516.43

First, I want to convert it to a date and time object

date_val = 30/08/2013  10:13:26 AM

Then I would like to split the date_val into date and time separately, and place them in seperate columns in my dataframe (df)

| **original date** | **date**     | **time** | 
41516.43              30/08/2013      10:13:26 AM
yoshiserry
  • 20,175
  • 35
  • 77
  • 104
  • what is this numerical format? – Andy Hayden Mar 14 '14 at 02:53
  • The date (as a short date) in excel, but then I removed all formatting. from the whole spreadsheet, making it much easier to read. So in short an excel date format? – yoshiserry Mar 14 '14 at 03:02
  • This page says DATE time Serial Number is the format.. no more detail than that. http://office.microsoft.com/en-au/excel-help/available-number-formats-HA102749055.aspx – yoshiserry Mar 14 '14 at 03:10
  • How are you getting the csv, have you tried using to_excel? – Andy Hayden Mar 14 '14 at 03:49
  • @AndyHayden I did use the To Excel > And I get >> 41516.43. I chose not to parse dates on import because I usually import the data as is, spend time looking at it to see what I can do. I don't usually have the luxury of knowing how many columns I will have and the format's of those columns. – yoshiserry Mar 14 '14 at 04:00

1 Answers1

6

Piecing together from another question:

In [11]: s = pd.Series([41516.43])

In [12]: from xlrd.xldate import xldate_as_tuple

In [13]: from datetime import datetime

In [14]: s.apply(lambda x: datetime(*xldate_as_tuple(x, 0)))
Out[14]:
0   2013-08-30 10:19:12
dtype: datetime64[ns]

Note: presumably slight difference is due to rounding of the float you gave.

and the "bare-knuckle no-seat-belts use-at-own-risk" version:

In [21]: pd.Timestamp('1899-12-30') + (pd.offsets.Day(1).nanos * s).astype(np.timedelta64)
Out[21]:
0   2013-08-30 10:19:12
dtype: datetime64[ns]

I think it's generally preferable to do parse dates while using read_excel.

Community
  • 1
  • 1
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
  • Thanks for that. Would you mind trying to explain what is LAMDA, and how it works. I keep seeing it around, but have no idea what it's doing. Is it essentially making 3 lines of code into 1? Can you still write your lamda statement another way without lamda? – yoshiserry Mar 16 '14 at 11:25
  • Also @ Andy Hayden, why do you have to convert the floating point number to a series first?\ – yoshiserry Mar 16 '14 at 11:29
  • @yoshiserry a column in a DataFrame is a Series. Lambda if an inline way of making a function. – Andy Hayden Mar 16 '14 at 16:30
  • could you do it with a normal function, and not lamda, I.e. are there ways of doing it without using lamda? – yoshiserry Mar 17 '14 at 03:12
  • @yoshiserry yes, a lambda is the same as a function, it's just inline. So you can do `def f(x): return datetime(*xldate_as_tuple(x, 0))`. – Andy Hayden Mar 17 '14 at 05:56