I get an unexpected error message when trying to slice two levels of a MultiIndex. Any help on how to do it? Please see the code below.

I am running Python 3.7.1 and Pandas 0.23.4

I have this dataframe:

import itertools
import pandas as pd

# Row labels: every (student, course) pair
index = list(itertools.product(['Ada', 'Quinn', 'Violet', 'Juan'],
                               ['Physics', 'Chemistry', 'Math', 'English']))
# Column labels: every (category, part) pair
headr = list(itertools.product(['Exams', 'Labs', 'Particip'],
                               ['I', 'II', 'III', 'IV']))
indx = pd.MultiIndex.from_tuples(index, names=['Student', 'Course'])
cols = pd.MultiIndex.from_tuples(headr)  # Notice these are unnamed
data = [[70 + x + y + (x * y) % 3 for x in range(12)] for y in range(16)]
df = pd.DataFrame(data, indx, cols)
dfls = df.sort_index(level=0)  # sort the rows by their index labels
dfls

As you can see below, I can select sublevels under a single top-level column without problems:

dfls.loc[(('Ada','Quinn'),('Math','Chemistry')),('Labs',('I','IV'))]

getting:

                            Labs
                            I   IV
Student     Course      
Ada         Chemistry       76  79
            Math            78  81
Quinn       Chemistry       81  84
            Math            80  83   

But when I try with two different top-level columns:

dfls.loc[(('Ada','Quinn'),('Math','Chemistry')),[('Exams',('I','III')), 
('Labs',('II','IV'))]]

I get the following error message:

ValueError: setting an array element with a sequence

How may I avoid this error message and get the result I am looking for? Thanking you in advance...

2 Answers


You will need to pass a list of tuples for slicing on the columns.

idx = (('Ada','Quinn'),('Math','Chemistry'))
cols = [('Exams', 'I'), ('Exams', 'III'), ('Labs', 'II'), ('Labs', 'IV')] 
dfls.loc[idx, cols]

                  Exams     Labs    
                      I III   II  IV
Student Course                      
Ada     Chemistry    71  75   78  79
        Math         72  75   78  81
Quinn   Chemistry    75  78   81  84
        Math         76  78   81  83

The row labels can be written in shorthand because you want the same set of courses for every selected student. That is not the case for the columns: you want different sublevels under 'Exams' than under 'Labs', so you need to spell out each (top level, sublevel) pair separately.

You can read more about MultiIndex-based slicing at How do I slice or filter MultiIndex DataFrame levels?.
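
If typing every pair by hand feels tedious, the list of tuples can be generated. The following is only a sketch: the `wanted` dict is a hypothetical helper that is not part of pandas, and the frame is rebuilt with `MultiIndex.from_product` so that the snippet runs on its own.

```python
import pandas as pd

# Rebuild the frame from the question so this snippet is self-contained.
index = pd.MultiIndex.from_product(
    [['Ada', 'Quinn', 'Violet', 'Juan'],
     ['Physics', 'Chemistry', 'Math', 'English']],
    names=['Student', 'Course'])
columns = pd.MultiIndex.from_product(
    [['Exams', 'Labs', 'Particip'], ['I', 'II', 'III', 'IV']])
data = [[70 + x + y + (x * y) % 3 for x in range(12)] for y in range(16)]
dfls = pd.DataFrame(data, index=index, columns=columns).sort_index(level=0)

# Hypothetical helper: map each top-level column to the sublevels wanted.
wanted = {'Exams': ['I', 'III'], 'Labs': ['II', 'IV']}

# Expand the mapping into explicit (top level, sublevel) pairs.
cols = [(top, sub) for top, subs in wanted.items() for sub in subs]

rows = (['Ada', 'Quinn'], ['Math', 'Chemistry'])
result = dfls.loc[rows, cols]
print(result)
```

The row order of the result may depend on the pandas version, so it is safer not to rely on it.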

cs95
  • Looks like I will need to make an edit to my post to address this specific case... – cs95 Dec 27 '18 at 20:40
  • @JoséLuisMartínez Please consider marking the answer accepted if it solved your problem. TIA. – cs95 Dec 27 '18 at 20:42
  • But, if that's the reason, how did I get a result when applied: dfls.loc[(('Ada','Quinn'),('Math','Chemistry')),('Labs',('I','IV'))]? – José Luis Martínez Dec 27 '18 at 20:43
  • @JoséLuisMartínez You were only slicing on one upper level, so it was easier for pandas to understand what you were trying to do. – cs95 Dec 27 '18 at 20:43
  • @JoséLuisMartínez ... if that answers your question. – cs95 Dec 27 '18 at 20:54
  • But @coldspeed, if that's the reason, how did I get a result when I applied `dfls.loc[(('Ada','Quinn'),('Math','Chemistry')),('Labs',('I','IV'))]`? – José Luis Martínez Dec 27 '18 at 21:50
  • Don't get me wrong, I thank you for the answer. But is it not possible to get a one-line answer, then? The proposed solution seems to me cumbersome and takes too many lines. – José Luis Martínez Dec 27 '18 at 22:06

Now I get it. Thanks to @coldspeed's answer, the one-line solution I was looking for is:

dfls.loc[(('Ada','Quinn'),('Math','Chemistry')), [('Exams', 'I'), ('Exams', 'III'), ('Labs', 'II'), ('Labs', 'IV')] ]
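
Another possibility, offered only as a sketch and not taken from the answers above, is to keep the nested shorthand for one top-level column at a time and join the pieces with `pd.concat`. The frame is rebuilt here so the snippet runs on its own; `rows`, `exams` and `labs` are illustrative names.

```python
import pandas as pd

# Rebuild the frame from the question so this snippet is self-contained.
index = pd.MultiIndex.from_product(
    [['Ada', 'Quinn', 'Violet', 'Juan'],
     ['Physics', 'Chemistry', 'Math', 'English']],
    names=['Student', 'Course'])
columns = pd.MultiIndex.from_product(
    [['Exams', 'Labs', 'Particip'], ['I', 'II', 'III', 'IV']])
data = [[70 + x + y + (x * y) % 3 for x in range(12)] for y in range(16)]
dfls = pd.DataFrame(data, index=index, columns=columns).sort_index(level=0)

rows = (['Ada', 'Quinn'], ['Math', 'Chemistry'])

# One selection per top-level column, using the shorthand that already worked.
exams = dfls.loc[rows, ('Exams', ['I', 'III'])]
labs = dfls.loc[rows, ('Labs', ['II', 'IV'])]

# Glue the two pieces together side by side; rows are aligned on the index.
combined = pd.concat([exams, labs], axis=1)
print(combined)
```

This costs more lines than the explicit list of tuples, but each piece states its top-level label only once.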