Update to the original question.
Example code:
import pandas as pd
df = pd.DataFrame({'Weight': [1.2, 2.0, 1.8, 2.4, 1.9, 2.3],
                   'Sex': ['Male', 'Female', 'Unknown', 'Male', 'Male', 'Female'],
                   'Neutered': ['Entire', 'Unknown', 'Neutered', 'Neutered', 'Neutered', 'Unknown'],
                   'Rabbit_Breed': ['Dutch', 'Lop', 'Dwarf', 'Giant', 'Cross-Breed', 'Dwarf'],
                   'Abscess-mouth': [0, 0, 1, 0, 0, 0],
                   'Overweight': [0, 1, 0, 1, 0, 1],
                   'underweight': [0, 0, 1, 0, 0, 1],
                   'molars-long': [1, 0, 1, 0, 0, 1]})
df.head()
NB: I have around 100 columns, so I cannot list them all. I'm looking for a way to group by breed or sex and sum across all the disorder columns, so I can find the most common disorders in relation to the breed or sex of a rabbit.
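A minimal sketch of the kind of result I'm after, assuming every column other than Weight, Sex, Neutered and Rabbit_Breed is a 0/1 disorder flag (the names id_cols and disorder_cols are just illustrative). Building the disorder list by exclusion avoids typing out all 100 column names:

```python
import pandas as pd

df = pd.DataFrame({'Weight': [1.2, 2.0, 1.8, 2.4, 1.9, 2.3],
                   'Sex': ['Male', 'Female', 'Unknown', 'Male', 'Male', 'Female'],
                   'Neutered': ['Entire', 'Unknown', 'Neutered', 'Neutered', 'Neutered', 'Unknown'],
                   'Rabbit_Breed': ['Dutch', 'Lop', 'Dwarf', 'Giant', 'Cross-Breed', 'Dwarf'],
                   'Abscess-mouth': [0, 0, 1, 0, 0, 0],
                   'Overweight': [0, 1, 0, 1, 0, 1],
                   'underweight': [0, 0, 1, 0, 0, 1],
                   'molars-long': [1, 0, 1, 0, 0, 1]})

# Assumption: these are the only non-disorder columns in the real data
id_cols = ['Weight', 'Sex', 'Neutered', 'Rabbit_Breed']

# Everything else is treated as a 0/1 disorder flag
disorder_cols = [c for c in df.columns if c not in id_cols]

# Number of rabbits with each disorder, per breed
by_breed = df.groupby('Rabbit_Breed')[disorder_cols].sum()

# Same idea with two grouping columns
by_breed_sex = df.groupby(['Rabbit_Breed', 'Sex'])[disorder_cols].sum()

print(by_breed)
```

Selecting the disorder columns before calling sum keeps Weight and the text columns out of the totals.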
I've attached an image of my thought process:
Original question: I'm looking to group by one or two columns and sum all the other columns. I'm not sure if I should use a range or something else, but I keep getting errors.
Unless I've misunderstood the purpose of groupby and sum. I've got about 100 columns of disorders in domestic rabbits and ultimately I'm trying to investigate the most common ones and plot them against breed or female/male etc.
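To make "most common" concrete, here is a rough sketch on a cut-down version of the sample data. It assumes the disorder columns are 0/1 flags, so a column sum is a count of affected rabbits and a column mean is the share of affected rabbits in each group; disorder_cols is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'Sex': ['Male', 'Female', 'Unknown', 'Male', 'Male', 'Female'],
                   'Rabbit_Breed': ['Dutch', 'Lop', 'Dwarf', 'Giant', 'Cross-Breed', 'Dwarf'],
                   'Abscess-mouth': [0, 0, 1, 0, 0, 0],
                   'Overweight': [0, 1, 0, 1, 0, 1],
                   'underweight': [0, 0, 1, 0, 0, 1],
                   'molars-long': [1, 0, 1, 0, 0, 1]})

# Assumption: everything except the descriptive columns is a 0/1 disorder flag
disorder_cols = [c for c in df.columns if c not in ('Sex', 'Rabbit_Breed')]

# Overall ranking: total cases per disorder, most common first
totals = df[disorder_cols].sum().sort_values(ascending=False)

# Cases per disorder within each sex
by_sex = df.groupby('Sex')[disorder_cols].sum()

# Share of rabbits affected within each sex (fairer when group sizes differ)
rate_by_sex = df.groupby('Sex')[disorder_cols].mean()

# Most frequent disorder in each group (on a tie, the first column wins)
top_by_sex = by_sex.idxmax(axis=1)

print(totals)
print(rate_by_sex)
```

With matplotlib installed, a grouped table such as by_sex can be drawn directly with by_sex.plot.bar(), which gives one cluster of bars per sex and one bar per disorder.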
Thank you!!