I need/want to work with float indices in pandas, but I get a KeyError when running something like this:

import numpy as np
import pandas as pd

inds = [1.1, 2.2]
cols = [5.4, 6.7]
df = pd.DataFrame(np.random.randn(2, 2), index=inds, columns=cols)
df[df.index[0]]  # raises KeyError: 1.1

I have seen some errors regarding precision, but shouldn't this work?

j_bio

1 Answer

You get the KeyError because df[df.index[0]] looks up a column labelled 1.1, and no such column exists here; 1.1 is a row label.
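To make the column-versus-row distinction concrete, here is a minimal sketch. It uses fixed values instead of random data (my assumption, purely so the results are reproducible); the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Fixed values instead of random data, so the results are predictable.
df = pd.DataFrame(
    np.arange(4.0).reshape(2, 2),  # [[0., 1.], [2., 3.]]
    index=[1.1, 2.2],
    columns=[5.4, 6.7],
)

# Plain brackets select by column label, so a column label works ...
col = df[5.4]

# ... while a row label is simply not among the columns.
row_label_in_columns = 1.1 in df.columns
row_label_in_index = 1.1 in df.index

try:
    df[1.1]
    raised_key_error = False
except KeyError:
    raised_key_error = True
```

So the error has nothing to do with float precision here: the label exists, just on the other axis.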

What you can do is use loc (label-based) or iloc (position-based) to access rows:

import numpy as np
import pandas as pd

inds = [1.1, 2.2]
cols = [5.4, 6.7]
df = pd.DataFrame(np.random.randn(2, 2), index=inds, columns=cols)

# to access e.g. the first row by its label, use
df.loc[df.index[0]]
# or, by integer position
df.iloc[0]

# example output (the values are random and will differ):
# 5.4    1.531411
# 6.7   -0.341232
# Name: 1.1, dtype: float64

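In principle, if you can, avoid equality comparisons for floating point numbers for the reason you already came across: precision. The 1.1 displayed to you is not exactly 1.1 for the computer, simply because most decimal fractions have no exact binary representation; storing them exactly would require infinite precision. Looking up the very same stored value, as in df.loc[df.index[0]], works because both sides are bit-for-bit identical. A value that comes out of a calculation may differ in the last digits, though, so it is safer to compare with a tolerance; for example, treat two numbers as equal if their difference is < 10^-6.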
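As a rough sketch of a tolerance-based lookup, the snippet below uses numpy.isclose to build a boolean mask over a float index. The frame, the index value 0.3 and the variable names are my own illustrative assumptions, not part of the question; numpy.isclose is used with its default tolerances (rtol=1e-05, atol=1e-08).

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 0.3 as a row label (fixed values for reproducibility).
df = pd.DataFrame(
    np.arange(4.0).reshape(2, 2),  # [[0., 1.], [2., 3.]]
    index=[0.3, 2.2],
    columns=[5.4, 6.7],
)

# A computed value that "should" be 0.3 but differs in the last bits.
computed = 0.1 + 0.2
exactly_equal = computed == 0.3

# Tolerance-based match: compare every index label against the computed value.
mask = np.isclose(df.index.to_numpy(), computed)

# A boolean mask can be passed to loc to select the matching row(s).
row = df.loc[mask]
```

Whether the default tolerances are appropriate depends on the scale of your labels; rtol and atol can be passed explicitly if they are not.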

FObersteiner
  • Thanks so much for the clear explanation. I see now it is a silly mistake but, being new to pandas, I couldn't see it! – j_bio Dec 12 '19 at 13:17
  • @j_bio: glad I could help! And if you *have* to do float comparisons, besides the SO post I linked there's also [numpy.isclose](https://docs.scipy.org/doc/numpy/reference/generated/numpy.isclose.html) - just in case ;-) – FObersteiner Dec 12 '19 at 13:21