Pandas DataFrame: need the mean of a subset of a column based on other columns

Question

I have a pandas dataframe like the one in this picture (image: Dataframe example), but with several thousand rows.

I need to get the mean score of all of the students in each class, grouped by year and kept separate for each score column. For example, the averages for the Photography class in 2015 would be 79.5 and 83.5 in the example in the picture.

I've been able to get the data to filter by the class column using

byClass = data[data['Class'].str.contains("Photography")==True]

and I was able to get all the means from there using

byClass= byClass.mean()

I tried adding a second parameter for the year like this:

byClass = data[data['Class'].str.contains("Photography")==True,data['Year']==2015]

But I haven't been able to get it to work. I have tried putting the 2015 in quotes, and I have tried searching for it using str.contains, but the dataframe has the Year column typed as int64, so str.contains fails because of the data type.
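For reference, a minimal sketch of how two conditions are usually combined in pandas: each condition is a boolean Series, and they are joined with `&` rather than separated by a comma inside the brackets. The sample rows and the score column names ("Score 1", "Score 2") are assumptions, since the real data is only shown in the picture; the values are made up so that the Photography 2015 means match the ones quoted above.

```python
import pandas as pd

# Made-up sample rows; "Score 1" and "Score 2" are assumed column names.
data = pd.DataFrame({
    "Class": ["Photography", "Photography", "Photography", "Art"],
    "Year": [2015, 2015, 2016, 2015],
    "Score 1": [75, 84, 90, 60],
    "Score 2": [80, 87, 70, 65],
})

# Build one boolean mask per condition and combine them element-wise with &.
# The comparison needs its own parentheses because & binds tighter than ==.
# Year is int64, so it is compared with the integer 2015, not a string.
mask = data["Class"].str.contains("Photography") & (data["Year"] == 2015)

# Select the matching rows and only the score columns, then average them.
by_class_year = data.loc[mask, ["Score 1", "Score 2"]].mean()
print(by_class_year)
```

Restricting the selection to the score columns before calling `mean()` keeps the Year and Class columns out of the average.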

Thanks for the quick response. However, I'm not able to get this to work. I get an error that says "# Add key to exclusions" and "KeyError: 'Class'". The reason I was trying to split the data into individual sets is that I'm going to need to store each group result in a dictionary formatted as {Class: Photography, Year: 2015, Avg Score 1: Score, Avg Score 2: Score}. Is that something that can be done with iloc? - xldncr, Jan 03 '20 at 01:17
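One possible way to get a dictionary per class and year without splitting the data by hand is to group on both columns and convert the result to records; `iloc` should not be needed for this. The sketch below is only an illustration: the sample rows are invented and the score column names ("Score 1", "Score 2") are assumed. A `KeyError: 'Class'` generally means that no column with exactly that name exists at the point of the call, so printing `data.columns` first is a reasonable check.

```python
import pandas as pd

# Made-up sample rows; "Score 1" and "Score 2" are assumed column names.
data = pd.DataFrame({
    "Class": ["Photography", "Photography", "Photography", "Art"],
    "Year": [2015, 2015, 2016, 2015],
    "Score 1": [75, 84, 90, 60],
    "Score 2": [80, 87, 70, 65],
})

# Confirm the grouping columns exist under exactly these names.
print(list(data.columns))

# as_index=False keeps Class and Year as ordinary columns in the result.
means = (
    data.groupby(["Class", "Year"], as_index=False)[["Score 1", "Score 2"]]
    .mean()
    .rename(columns={"Score 1": "Avg Score 1", "Score 2": "Avg Score 2"})
)

# One dictionary per (Class, Year) group.
records = means.to_dict("records")
for record in records:
    print(record)
```

Each entry in `records` has the keys Class, Year, Avg Score 1, and Avg Score 2, which matches the layout described in the comment.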

