Let's say I have a pandas Series, and I want to access a set of elements at specific indices, like so:
In [1]:
from pandas import Series
import numpy as np
s = Series(np.arange(0,10))
In [2]: s.loc[[3,7]]
Out[2]:
3 3
7 7
dtype: int64
The .loc method accepts a list as the parameter for this type of selection. The .iloc and .ix methods work the same way.
However, if I use a tuple for the parameter, both .loc and .iloc fail:
In [5]: s.loc[(3,7)]
---------------------------------------------------------------------------
IndexingError Traceback (most recent call last)
........
IndexingError: Too many indexers
In [6]: s.iloc[(3,7)]
---------------------------------------------------------------------------
IndexingError Traceback (most recent call last)
........
IndexingError: Too many indexers
And .ix produces a strange result:
In [7]: s.ix[(3,7)]
Out[7]: 3
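For what it's worth, converting the tuple to a list first sidesteps the error. Here is a minimal sketch, assuming the default integer index from the Series above; the name `wanted` is only illustrative:

```python
import numpy as np
from pandas import Series

s = Series(np.arange(0, 10))
wanted = (3, 7)  # illustrative name: the positions I want, held in a tuple

# Passing the tuple directly is what raises "Too many indexers" above,
# so convert it to a list before handing it to the indexer.
by_label = s.loc[list(wanted)]
by_position = s.iloc[list(wanted)]

print(by_label.tolist())
print(by_position.tolist())
```

With the default index, labels and positions coincide, so both selections return the same two elements.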
Now, I get that you can't index with a tuple even on a raw Python list:
In [27]:
x = list(range(0,10))
x[(3,7)]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-cefdde088328> in <module>()
1 x = list(range(0,10))
----> 2 x[(3,7)]
TypeError: list indices must be integers or slices, not tuple
To retrieve a set of specific indices from a list, you need to use a comprehension, as explained here.
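A minimal sketch of that approach for a plain Python list. The names `wanted`, `picked`, and `also_picked` are illustrative, and `operator.itemgetter` is included only as a standard-library alternative to the comprehension:

```python
from operator import itemgetter

x = list(range(0, 10))
wanted = (3, 7)  # illustrative name: the indices to pull out

# List comprehension: look up each index one at a time.
picked = [x[i] for i in wanted]

# itemgetter with several indices returns a tuple of the matching items.
also_picked = list(itemgetter(*wanted)(x))

print(picked)
print(also_picked)
```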
But on the other hand, using a tuple to select rows from a pandas DataFrame seems to work fine for all three indexing methods. Here's an example with the .loc method:
In [8]:
from pandas import DataFrame
df = DataFrame({"x" : np.arange(0,10)})
In [9]:
df.loc[(3,7),"x"]
Out[9]:
3 3
7 7
Name: x, dtype: int64
My three questions are:
- Why won't the Series indexers accept a tuple? It would seem natural to use a tuple since the set of desired indices is an immutable, single-use parameter. Is this solely for the purpose of mimicking the list interface?
- What is the explanation for the strange Series .ix result?
- Why the inconsistency between Series and DataFrame on this matter?