I'm doing a little bit of math on some indices that I have saved in a CSV file, and I'm getting some behavior from .loc that I can only describe as... strange.
When I read this CSV file into a DataFrame using pandas, I see the following:
[1]: import pandas as pd
[2]: df = pd.read_csv(csv_path, parse_dates=True, index_col="Date")
[3]: df = df.apply(pd.to_numeric, errors='coerce') # shouldn't matter
[4]: df.head(5)
Date idx1 idx2 idx3 idx4 idx5
2019-03-22 106.1069 106.6425 106.520 106.45 105.870 ...
2019-03-21 106.6994 107.1746 106.975 106.87 106.145 ...
2019-03-20 106.4900 107.0894 106.875 106.84 106.095 ...
2019-03-19 106.4661 106.9107 106.820 106.71 106.100 ...
2019-03-18 106.5319 107.0137 106.760 106.75 106.100 ...
[5 rows x 53 columns]
When I print the index and index.values, I also see the following:
[5]: print df.index
DatetimeIndex(['2019-03-22', '2019-03-21', '2019-03-20', '2019-03-19',
'2019-03-18', '2019-03-15', '2019-03-14', '2019-03-13',
'2019-03-12', '2019-03-11',
...
'2013-02-07', '2013-02-06', '2013-02-05', '2013-02-04',
'2013-02-01', '2013-01-31', '2013-01-30', '2013-01-29',
'2013-01-28', '2013-01-25'],
dtype='datetime64[ns]', name=u'Date', length=1539, freq=None)
[6]: print df.index.values
['2019-03-22T00:00:00.000000000' '2019-03-21T00:00:00.000000000'
'2019-03-20T00:00:00.000000000' ... '2013-01-29T00:00:00.000000000'
'2013-01-28T00:00:00.000000000' '2013-01-25T00:00:00.000000000']
Now here's where it gets weird. If I run the following:
[7]: df.loc["2019-03-21"]
Date idx1 idx2 idx3 idx4 idx5
2019-03-21 106.6994 107.1746 106.975 106.87 106.145
[1 rows x 53 columns]
I get what I expect, which is the row corresponding to that date. However, when I run the exact same thing with a different date:
[8]: print df.loc["2019-03-22"]
KeyError: 'the label [2019-03-22] is not in the [index]'
I get a KeyError saying this label is not in the index. I have gone to the actual CSV file to confirm that the date is there, and I've tried various other dates with .loc and had success with all of them except 2019-03-22.
Can anyone give me a hint as to what might be going on here? I cannot for the life of me figure it out.
In response to the question from Edeki Okoh below:
print df.index.get_loc("2019-03-22")
[0]
print df.index.get_loc("2019-03-21")
[1]
df.iloc[0]
Out[17]:
idx1 106.107
idx2 106.642
idx3 106.52
idx4 106.45
idx5 105.87
Name: 2019-03-22 00:00:00, dtype: object
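One further check that might help narrow this down is to skip string labels entirely and look the date up with an explicit pd.Timestamp. The sketch below does that on a small toy frame rebuilt from the first five rows shown above. The name `toy` and the choice of only two columns are assumptions made for illustration; this is not the real 53-column data and it does not reproduce the KeyError by itself.

```python
import pandas as pd

# Toy frame rebuilt from the first five rows shown in the question
# (only idx1 and idx2; `toy` is a hypothetical name, not the real data).
toy = pd.DataFrame(
    {
        "idx1": [106.1069, 106.6994, 106.4900, 106.4661, 106.5319],
        "idx2": [106.6425, 107.1746, 107.0894, 106.9107, 107.0137],
    },
    index=pd.DatetimeIndex(
        ["2019-03-22", "2019-03-21", "2019-03-20", "2019-03-19", "2019-03-18"],
        name="Date",
    ),
)

key = pd.Timestamp("2019-03-22")

# Basic facts about the index that can affect label lookups.
is_unique = toy.index.is_unique
is_descending = toy.index.is_monotonic_decreasing

# Look the date up with an explicit Timestamp instead of a string.
present = key in toy.index          # membership test on the index
position = toy.index.get_loc(key)   # integer position of the label
row = toy.loc[key]                  # the matching row as a Series
```

If the same Timestamp lookup succeeds on the real df while the string lookup fails, that would suggest the problem lies in how the string label is being resolved rather than in the data itself.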