Why is .loc
only returning a single row where multiple rows have the same MultiIndex
?
Given the following dataframe
col0 col1 col2
idx0 idx1
0 0 1.0 example1 1.0
0 4.0 example2 8.0
1 9.0 example3 27.0
1 16.0 example4 64.0
1 0 0.5 example1 0.5
0 2.0 example2 4.0
1 4.5 example3 13.5
1 8.0 example4 32.0
the .xs
operation will select
In [121]: df.xs((0,1), level=[0,1])
Out[121]:
col0 col1 col2
idx0 idx1
0 1 9.0 example3 27.0
1 16.0 example4 64.0
whilst the .loc
operation will select
In [125]: df.loc[[(0,1)]]
Out[125]:
col0 col1 col2
idx0 idx1
0 1 16.0 example4 64.0
This is highlighted even further by the following
In [149]: df.loc[pd.IndexSlice[:, 1], :]
Out[149]:
col0 col1 col2
idx0 idx1
0 1 9.0 example3 27.0
1 16.0 example4 64.0
In [150]: df.loc[pd.IndexSlice[0, 1], :]
Out[150]:
col0 16
col1 example4
col2 64
Name: (0, 1), dtype: object
Set Up
import pandas as pd
import numpy as np
idx0 = range(2)
idx1 = np.repeat(range(2), 2)
midx = pd.MultiIndex(
levels=[idx0, idx1],
labels=[
np.repeat(range(len(idx0)), len(idx1)),
np.tile(range(len(idx1)), len(idx0))
],
names=['idx0', 'idx1']
)
df = pd.DataFrame(
[
[i**2/float(j), 'example{}'.format(i), i**3/float(j)]
for j in range(1, len(idx0) + 1)
for i in range(1, len(idx1) + 1)
],
columns=['col0', 'col1', 'col2'],
index=midx
)