Consider the following df:

import datetime

import pandas as pd
from numpy import nan

d = {'.KS200': {datetime.date(2016, 10, 3): nan, datetime.date(2016, 10, 4): 259.18, datetime.date(2016, 10, 5): 258.99, datetime.date(2016, 10, 6): 261.13, datetime.date(2016, 10, 7): 260.06}, '0001.HK': {datetime.date(2016, 10, 3): 99.45, datetime.date(2016, 10, 4): 99.45, datetime.date(2016, 10, 5): 99.25, datetime.date(2016, 10, 6): 98.7, datetime.date(2016, 10, 7): 98.0}}

df = pd.DataFrame.from_dict(d)
print(df)

            .KS200  0001.HK
2016-10-03     NaN    99.45
2016-10-04  259.18    99.45
2016-10-05  258.99    99.25
2016-10-06  261.13    98.70
2016-10-07  260.06    98.00

Now if I try to index:

df.loc['2016-10-03']

I get a KeyError (the traceback ends with raise KeyError(key) from err).

Oddly enough, when I run print(df.index) I get dtype=object instead of datetime. So is my index a datetime or an object?

moth
  • It's not a string: it's a date. Use a datetime object to index it. You explicitly set the values to datetime dates in your first code line even. – 9769953 Nov 16 '21 at 17:38
  • I am printing a portion of a large dataset using `to_dict()`, and it prints `datetime`, but I never specify that anywhere in my code. Whenever I print `df.index` I get `object` as the `dtype`. That's the confusion. – moth Nov 16 '21 at 17:44
  • @9769953 if it's a `datetime`, why does `print(df.index)` show `object`? – moth Nov 16 '21 at 17:50
  • Because the index can contain any type of object - it just so happens that all objects are of the same type in the dataframe you created. – gshpychka Nov 16 '21 at 18:13
  • The dtype is `object` because the index stores Python `datetime.date` objects, and NumPy has no native dtype for those (its own date type is `datetime64`), so they are held as generic objects. If you look at `df.index.values`, you'll see it's a NumPy array, and `df.index.values.dtype` is also `object`. Possibly a shortcoming of Pandas (might even be noted in an issue somewhere, if you search around). – 9769953 Nov 16 '21 at 19:58
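
To make the dtype point in these comments concrete, here is a minimal sketch. The two-row frame and the names `object_index` and `converted` are made up for illustration and are not the asker's real data. It only inspects the dtypes and does not modify anything.

```python
import datetime

import pandas as pd

# Illustrative frame: the index labels are plain Python datetime.date objects.
df = pd.DataFrame(
    {"0001.HK": [99.45, 99.45]},
    index=[datetime.date(2016, 10, 3), datetime.date(2016, 10, 4)],
)

object_index = df.index
print(object_index.dtype)         # dtype of the pandas Index
print(object_index.values.dtype)  # dtype of the underlying NumPy array
print(type(object_index[0]))      # type of one individual label

# pd.to_datetime returns a new DatetimeIndex; df itself is left untouched here.
converted = pd.to_datetime(object_index)
print(converted.dtype)
```

The first two prints should report `object`, while the individual label is still a `datetime.date`. Only the converted index carries a real datetime dtype.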

1 Answer

Your DataFrame doesn't have a string index - it has a datetime.date index.

Try:

df.loc[datetime.date(2016, 10, 3)]

Better yet, create a datetime index:

df.index = pd.to_datetime(df.index)

Then, your original indexing will work.
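
Putting both suggestions side by side, a small sketch on a made-up three-row frame (the column values are copied from the question; the variable names `string_lookup_failed`, `by_date` and `by_string` are only for illustration):

```python
import datetime

import pandas as pd

# Illustrative frame with datetime.date labels, as in the question.
df = pd.DataFrame(
    {"0001.HK": [99.45, 99.45, 99.25]},
    index=[
        datetime.date(2016, 10, 3),
        datetime.date(2016, 10, 4),
        datetime.date(2016, 10, 5),
    ],
)

# 1) A string label is not present in an object index of dates.
try:
    df.loc["2016-10-03"]
    string_lookup_failed = False
except KeyError:
    string_lookup_failed = True

# 2) Looking up with a datetime.date object matches the stored label.
by_date = df.loc[datetime.date(2016, 10, 3), "0001.HK"]

# 3) After converting to a DatetimeIndex, the string lookup works too.
df.index = pd.to_datetime(df.index)
by_string = df.loc["2016-10-03", "0001.HK"]

print(string_lookup_failed, by_date, by_string)
```

If most of your lookups and slices are by date string, converting once right after loading the data is probably the more convenient of the two options.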

gshpychka
  • Why does `print(df.index)` show `object`, then: `Index([2016-10-03, 2016-10-04, 2016-10-05, 2016-10-06, 2016-10-07], dtype='object')`? Shouldn't it be `datetime`? – moth Nov 16 '21 at 17:40
  • Because you didn't specify a type for the index. Updated my answer for a better solution. – gshpychka Nov 16 '21 at 17:40
  • First you say that my data frame has a datetime index, then you tell me to assign a datetime to my index. Doesn't the first statement contradict the second? – moth Nov 16 '21 at 17:55
  • No. I did not say your dataframe has a datetime index - I said it has a `datetime.date` index. To convert it to a `pandas.DatetimeIndex`, use `pd.to_datetime` as specified in the answer. Did the solution work? – gshpychka Nov 16 '21 at 18:01
  • A more precise way to put it is that your dataframe has an index with type `object` that happens to contain `datetime.date` objects. – gshpychka Nov 16 '21 at 18:03
  • Hmm, OK. So how can I check the type of the objects an index holds, please? – moth Nov 16 '21 at 18:05
  • You can iterate through the rows and see what types your index contains. In your case, all index objects are of type `datetime.date`, but Pandas doesn't enforce this - you could add a row with a string index. Because you didn't specify a type at creation, the index can hold different types. – gshpychka Nov 16 '21 at 18:07
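
One way to do the check described in the last comment is sketched below. The indexes `idx` and `mixed` are made up for illustration; `pd.api.types.infer_dtype` is a pandas helper that summarizes what an object-dtype container actually holds.

```python
import datetime

import pandas as pd

# Illustrative object index holding only datetime.date labels.
idx = pd.Index([datetime.date(2016, 10, 3), datetime.date(2016, 10, 4)])

# Collect the distinct Python types of the labels.
element_types = {type(label) for label in idx}
print(element_types)

# Ask pandas to summarize the contents of the object index.
inferred = pd.api.types.infer_dtype(idx)
print(inferred)

# Nothing stops an object index from mixing types, e.g. a date and a string.
mixed = pd.Index([datetime.date(2016, 10, 3), "2016-10-04"], dtype=object)
mixed_types = {type(label) for label in mixed}
print(mixed_types)
```

For an index that holds only dates, the set should contain just `datetime.date`. More than one type in the set means the labels are mixed, which is worth cleaning up before converting with `pd.to_datetime`.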