
I have a string consisting of the four-digit year followed by the ISO week of the year (some years have 53 weeks, because ISO week 1 is the week that contains the year's first Thursday). I want to convert it to a datetime object using pandas.to_datetime(). So I do:

pandas.to_datetime('201145', format='%Y%W')

and it returns:

Timestamp('2011-01-01 00:00:00')

which is not right. Or if I try:

pandas.to_datetime('201145', format='%Y%V')

it tells me that %V is a bad directive.

What am I doing wrong?

user1566200
  • I think this may be a bug (see https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior). Perhaps worth reporting on [github](https://github.com/pydata/pandas/issues/) (perhaps it is [this one](https://github.com/pydata/pandas/issues/10315), but I think it's different) – Andy Hayden Jul 26 '16 at 21:47

1 Answer


I think the following question would be useful to you: Reversing date.isocalendar()

Using the functions provided in that question this is how I would proceed:

import datetime
import pandas as pd  # only needed later, when adding the results to the DataFrame

def iso_year_start(iso_year):
    "The gregorian calendar date of the first day of the given ISO year"
    # January 4th always falls in ISO week 1, so step back to that week's Monday.
    fourth_jan = datetime.date(iso_year, 1, 4)
    delta = datetime.timedelta(fourth_jan.isoweekday()-1)
    return fourth_jan - delta

def iso_to_gregorian(iso_year, iso_week, iso_day):
    "Gregorian calendar date for the given ISO year, week and day"
    year_start = iso_year_start(iso_year)
    return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)

def time_stamp(yourString):
    "Split a 'YYYYWW' string into (ISO year, ISO week, ISO day)"
    year = int(yourString[0:4])
    week = int(yourString[-2:])
    day = 1  # use Monday, the first day of the ISO week
    return year, week, day

# Parse the string once and unpack the tuple into the three arguments.
yourTimeStamp = iso_to_gregorian(*time_stamp('201145'))

print(yourTimeStamp)

Then run that function on each of your values and add the results to the DataFrame as datetime objects.

The result I got from your specified string was:

2011-11-07
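
As a side note, on Python 3.8 or newer the standard library can do this conversion directly with datetime.date.fromisocalendar(), so the helper functions may not be needed at all. A minimal sketch, assuming the same 'YYYYWW' string layout as in the question; the function name iso_week_string_to_date is only illustrative:

```python
import datetime

def iso_week_string_to_date(value, iso_day=1):
    """Convert a 'YYYYWW' ISO year/week string to a date (Monday by default)."""
    iso_year = int(value[:4])   # first four characters: ISO year
    iso_week = int(value[4:])   # remaining characters: ISO week number
    # date.fromisocalendar() requires Python 3.8 or newer.
    return datetime.date.fromisocalendar(iso_year, iso_week, iso_day)

print(iso_week_string_to_date('201145'))  # 2011-11-07
```

This gives the same date as the functions above for '201145', and it raises a ValueError for a week number that does not exist in the given ISO year.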
sTr8_Struggin
  • I was literally just joking about having to parse the string into the first four and last two characters if I couldn't figure this out. Guess it turns out that actually is the answer. Thanks! – user1566200 Jul 26 '16 at 22:07
  • So I tried this on a very large DataFrame, and it's super slow - about 60k rows take ~3 minutes: `time_convert_func = lambda x: iso_to_gregorian( time_stamp(x)[0] , time_stamp(x)[1], time_stamp(x)[2] )` followed by `result = df['startdate'].astype(str).apply(time_convert_func)` Any suggestions? – user1566200 Jul 27 '16 at 00:35
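
One possible way around the slow row-by-row apply is to let pandas parse the whole column in one call. This is a sketch under assumptions, not a tested fix for that exact DataFrame: it assumes a pandas version that accepts the ISO directives %G, %V and %u in to_datetime (support was added around pandas 0.25), and the sample values and the week_start column name are made up for illustration. The %V directive needs a weekday alongside it, so '1' (Monday) is appended to every string.

```python
import pandas as pd

# Hypothetical sample data; 'startdate' mirrors the column name in the comment above.
df = pd.DataFrame({'startdate': [201145, 201101, 201552]})

# Append ISO weekday 1 (Monday) so each string names one specific day,
# then parse the whole column in a single vectorized call.
iso_strings = df['startdate'].astype(str) + '1'
df['week_start'] = pd.to_datetime(iso_strings, format='%G%V%u')

print(df)
```

If the installed pandas rejects these directives, a fallback is to convert only the unique strings with the pure-Python function and map the results back onto the column, since there are far fewer distinct weeks than rows.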