2

I have a pivot table created from pandas (DataFrame object).

Currently, I have multiple indexes and I want to be able to filter through some of them. To clarify this is how the pivot tables looks like.

enter image description here

There are 1000s of those. What I want to be able to do is get all the instances where opt_step is 1, cf is pv_area_cost1a and when the optimization is MANHATTAN. The end goal is to be able to graph iterations vs cost for some instances.

The pivot table is called pt. I've tried pt.index.get_level_values['pv_area_cost1a'] and I have read the entire page in advanced indexing to no avail.

Arnaud
  • 7,259
  • 10
  • 50
  • 71
synkilla12
  • 43
  • 5
  • Isnt' 'get_level_values' a function? Also the the [docs](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.get_level_values.html) say it should be an int, but indeed I seem to remember using it with column labels – vmg Oct 07 '15 at 22:45
  • I'm sorry, the get_level_values function returns all the values of a specific index. For example, cf has pv_area_cost1a, full_cost, and several more cfs. Some of the example on the docs says you should be able to retrieve values for a specific index with pt['pv_area_cost1a'] but it does not. – synkilla12 Oct 07 '15 at 22:51
  • I think [this](http://stackoverflow.com/questions/28002197/pandas-proper-way-to-set-values-based-on-condition-for-subset-of-multiindex-da) question can help you – vmg Oct 07 '15 at 22:57

1 Answers1

1

First, create a multi-indexed dataframe:

df = pd.DataFrame({'i1': [1, 1, 1, 1], 'i2': [2, 2, 3, 3], 'i3': [4, 5, 4, 5], 'v1': [10] * 4, 'v2': [20] * 4}).set_index(['i1', 'i2', 'i3'])
>>> df
          v1  v2
i1 i2 i3        
1  2  4   10  20
      5   10  20
   3  4   10  20
      5   10  20

For me, the easiest way to slice this type of dataframe is to use a combination of .loc and IndexSlice. So, to slice the above df where i2=3 and i3=5:

>>> df.loc[pd.IndexSlice[:, 3, 5], :]

          v1  v2
i1 i2 i3        
1  3  5   10  20

The : inside IndexSlice signifies to select all rows of i1. The very last : inside the loc function signifies to select all columns in the dataframe (v1 and v2).

Alexander
  • 105,104
  • 32
  • 201
  • 196