There are multiple questions on Stack Overflow comparing loc, iloc, and ix, such as this one, and multiple questions discussing speed differences, such as this one. The consensus seems to be that .ix is faster, but it is deprecated.

This leads me to my question: if .ix is so much faster, especially in label-based indexing, why deprecate it? Why would you not want to use the faster method? The only reason I have found for deprecating .ix is that it confused people, since it worked for both labels and integers. Am I missing something? Or is the only downside to .ix that it is confusing and so may not be supported in the future?

Also, a side question about the implementation of the three methods: how is it that .ix is both faster and less specific? This seems counter-intuitive to me. I would expect a method to get slower the more general it gets. Why not write loc and iloc to be as fast as .ix?

cs95
noah
  • In that link they are comparing indexing *a single element* in the dataframe, for which you should use `.at` or `.iat`. Try slicing something larger to compare the relevant performance. – juanpa.arrivillaga Jun 03 '19 at 17:15
  • .ix was confusing. Let's say you have a dataframe with an index of range(2, 12). If you used .loc[2], you would get the row with the label 2. If you used .iloc[2], you would get the row at position 2, which in this dataframe would be the row labelled 4. Now, if you used the deprecated .ix, it is ambiguous. – Scott Boston Jun 03 '19 at 17:18
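
The example from the last comment can be sketched in pandas roughly as follows. This is a minimal illustration, not code from the thread: the column name `value` and its contents are assumptions made up for the example. It uses only `.loc` and `.iloc`, since `.ix` was removed in pandas 1.0.

```python
import pandas as pd

# Index labels run from 2 to 11, so labels and positions disagree.
# The "value" column is a made-up placeholder for illustration.
df = pd.DataFrame({"value": list("abcdefghij")}, index=range(2, 12))

by_label = df.loc[2]      # row whose index label is 2 (the first row)
by_position = df.iloc[2]  # row at position 2, whose index label is 4

print(by_label["value"])     # a
print(by_position["value"])  # c
print(by_position.name)      # 4
```

With an integer index like this one, a bare `2` could plausibly mean either row, which is the ambiguity the comment describes.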

1 Answer

ix has to make assumptions as to what the labels mean. This is not intuitive behaviour, and may lead to serious breakage on corner cases (such as when your column labels are integers themselves). With loc, you're only passing labels. With iloc, you're only passing integer position indexes. The input is obvious and the output is as well.
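
As a rough sketch of the integer-column corner case mentioned above, assuming a small made-up frame whose column labels are 10, 20, and 30 (these values are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical frame whose column labels are the integers 10, 20, 30.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=[10, 20, 30])

col_by_label = df.loc[:, 10]     # the column labelled 10
col_by_position = df.iloc[:, 1]  # the second column, which is labelled 20

print(col_by_label.tolist())     # [1, 4]
print(col_by_position.tolist())  # [2, 5]

# A position is not a label: there is no column labelled 1,
# so label-based lookup raises KeyError instead of guessing.
try:
    df.loc[:, 1]
except KeyError:
    print("no column labelled 1")
```

Each accessor interprets the integer in exactly one way, so the result never depends on what the labels happen to be.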

Now, the speed differences mentioned are of the order of milliseconds or microseconds which is a "seriously, don't worry about it™" kind of difference. I consider that a worthy tradeoff for a more consistent, robust API. 'Nuff said.

cs95