I have a datetime issue where I am trying to match up a dataframe with dates as index values.

For example, I have dr, which is a list of numpy.datetime64 values:

dr = [numpy.datetime64('2014-10-31T00:00:00.000000000'),
      numpy.datetime64('2014-11-30T00:00:00.000000000'),
      numpy.datetime64('2014-12-31T00:00:00.000000000'),
      numpy.datetime64('2015-01-31T00:00:00.000000000'),
      numpy.datetime64('2015-02-28T00:00:00.000000000'),
      numpy.datetime64('2015-03-31T00:00:00.000000000')]

Then I have a dataframe, returndf, with dates as index values:

print(returndf) 
             1    2    3    4    5    6    7    8    9    10
10/31/2014  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
11/30/2014  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN

Please ignore the missing values

Whenever I try to match a date in dr against the index of returndf, for example with returndf.loc[str(dr[1])] for a single month, I get an error:

KeyError: 'the label [2014-11-30T00:00:00.000000000] is not in the [index]'

I would appreciate it if someone could help me convert numpy.datetime64('2014-10-31T00:00:00.000000000') into 10/31/2014 so that I can match it to the dataframe index value.

Thank you,

David Leon
f1racer
  • Possible duplicate of [Convert numpy.datetime64 to string object in python](https://stackoverflow.com/questions/19502506/convert-numpy-datetime64-to-string-object-in-python) – Matts Feb 21 '18 at 03:16

1 Answer

  1. Your index for returndf is not a DatetimeIndex. Make it so:

    returndf = returndf.set_index(pd.to_datetime(returndf.index))
    
  2. Your dr is a list of NumPy datetime64 objects. That bothers me, so convert it to a DatetimeIndex:

    dr = pd.to_datetime(dr)
    
  3. Your sample data clearly shows that the index of returndf does not include all the items in dr. In that case, use reindex:

    returndf.reindex(dr)
    
                 1   2   3   4   5   6   7   8   9  10
    2014-10-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2014-11-30 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2014-12-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-01-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-02-28 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
    2015-03-31 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
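
Putting the three steps together, here is a minimal, self-contained sketch. The returndf built below is only a hypothetical stand-in for the frame in the question (string dates as the index, all values missing), and dr is shortened to three dates. Passing an explicit format to pd.to_datetime is an assumption based on the MM/DD/YYYY labels shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's returndf: string dates as the index.
returndf = pd.DataFrame(
    np.nan,
    index=["10/31/2014", "11/30/2014"],
    columns=range(1, 11),
)

dr = [
    np.datetime64("2014-10-31T00:00:00.000000000"),
    np.datetime64("2014-11-30T00:00:00.000000000"),
    np.datetime64("2014-12-31T00:00:00.000000000"),
]

# Step 1: turn the string index into a DatetimeIndex.
returndf.index = pd.to_datetime(returndf.index, format="%m/%d/%Y")

# Step 2: turn the list of numpy.datetime64 values into a DatetimeIndex.
dr = pd.to_datetime(dr)

# A single-date lookup now works because both sides hold timestamps.
row = returndf.loc[dr[1]]

# Step 3: reindex adds rows (filled with NaN) for dates missing from returndf.
aligned = returndf.reindex(dr)

# If a string label such as 10/31/2014 is really needed, format the timestamp.
label = dr[0].strftime("%m/%d/%Y")
```

Comparing timestamps with timestamps avoids depending on how a date happens to be printed, which is what made the original string lookup fail.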
    
piRSquared
  • That was very helpful, but I am still a bit confused about datetimes. Here are the steps that come before the variable dr: `data_date = pd.to_datetime(testdf["Date"], format="%m/%d/%Y")` followed by `df = sorted(data_date.unique())`. data_date seems to be correct, but why does using sorted and unique convert df into datetime64? – f1racer Feb 21 '18 at 14:16
  • You've got several things going on in there. `sorted` is a Python function and returns a list. Also, you have `df = sorted(data_date.unique())`. The `unique()` method will return a NumPy array. Not to mention, I don't even know what `data_date` is; that wasn't in your original question. Next: you don't call `pd.to_datetime` on a dataframe unless that dataframe has specifically named columns, and if it does have those columns, you don't need to use the `format` argument. If you want more clarity, post a new question. Not enough room in comments. – piRSquared Feb 21 '18 at 14:27
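
To illustrate the point about `sorted` and `unique()`, here is a rough sketch. The testdf below is hypothetical, built only to mirror the names in the comment above. The exact element type inside the sorted list may differ between pandas versions, so the sketch only relies on the result being a plain list.

```python
import pandas as pd

# Hypothetical input mirroring the comment: a "Date" column of MM/DD/YYYY strings.
testdf = pd.DataFrame({"Date": ["11/30/2014", "10/31/2014", "11/30/2014"]})

# Calling pd.to_datetime on a single column returns a Series of timestamps.
data_date = pd.to_datetime(testdf["Date"], format="%m/%d/%Y")

# unique() returns an array of values rather than a Series, and the built-in
# sorted() always returns a plain Python list, so the pandas container is lost.
as_list = sorted(data_date.unique())

# One way to stay inside pandas: build a DatetimeIndex and sort that instead.
dr = pd.DatetimeIndex(data_date.unique()).sort_values()
```

Keeping dr as a DatetimeIndex means it can be passed straight to `reindex` or used with `.loc` on a frame whose index is also a DatetimeIndex.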