I am just starting to learn the pandas DataFrame and came across %timeit, so I used it to compare a few ways of reading a single value from a DataFrame. Differences like these become highly relevant when datasets get large. Here are the timings:

%timeit data.ix[0,0]

10000 loops, best of 3: 159 µs per loop

%timeit data.loc[0,'nation']

10000 loops, best of 3: 158 µs per loop

%timeit data.iloc[0,0]

10000 loops, best of 3: 132 µs per loop

%timeit data.iat[0,0]

100000 loops, best of 3: 5.9 µs per loop

As you can see, data.iat[0,0] is far faster than the others.

My question is: why is .iat so different from the others, and how does it work? Can it be used with any data?

Axis
    Possible [duplicate](http://stackoverflow.com/questions/28757389/loc-vs-iloc-vs-ix-vs-at-vs-iat). – Psidom Mar 07 '17 at 03:00

1 Answer

Firstly, don't use ix. Its use cases are more confusing than those of iloc/loc or iat/at, and ix will be deprecated.
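To illustrate why keeping label-based and position-based access separate is less confusing, here is a minimal sketch. The frame and its index are made up for this example; only the 'nation' column name comes from the question. When integer labels do not match row positions, loc and iloc return different rows, and as far as I know that ambiguity is what made ix hard to reason about.

```python
import pandas as pd

# Hypothetical frame: the integer labels deliberately differ from the positions.
df = pd.DataFrame({"nation": ["FR", "DE", "IT"]}, index=[2, 0, 1])

# loc is label-based: this is the row whose index label is 0.
by_label = df.loc[0, "nation"]

# iloc is position-based: this is the first row, whatever its label.
by_position = df.iloc[0, 0]

print(by_label, by_position)
```

With loc and iloc the intent is explicit in the accessor name, so the reader never has to guess which interpretation applies.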

Second, get_value is much quicker, but it isn't intended to be public API, though nothing stops you from using it. See @jeff's comment.


Now the meat of the answer:

iloc and loc accept array-like input; iat and at do not. So if you are accessing a single value in the DataFrame, by all means use iat and at. However, if you want to select with boolean arrays, an array of positions, or a list of index values, you can't use iat or at, so use iloc and loc.
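A small sketch of that distinction, assuming a made-up frame (only the 'nation' column name is taken from the question; the 'population' column and all values are hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration only.
data = pd.DataFrame(
    {"nation": ["FR", "DE", "IT"], "population": [67, 83, 60]}
)

# Single value: iat takes integer positions, at takes labels.
first_by_position = data.iat[0, 0]
first_by_label = data.at[0, "nation"]

# Array-like selectors: loc takes a boolean mask, iloc a list of positions.
mask = data["population"] > 65
rows_by_mask = data.loc[mask, "nation"]
rows_by_position = data.iloc[[0, 2], 0]

# iat refuses a list of positions instead of returning several values.
try:
    data.iat[[0, 2], 0]
    iat_accepts_list = True
except (ValueError, TypeError):
    iat_accepts_list = False

print(first_by_position, first_by_label, iat_accepts_list)
```

Because iat and at only ever have to resolve one row and one column, they can plausibly skip the input handling that loc and iloc need for slices, lists, and masks, which is consistent with the timings in the question.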

piRSquared