Grouping by multiple columns with missing data:
import pandas as pd

data = [['Falcon', 'Captive', 390], ['Falcon', None, 350],
        ['Parrot', 'Captive', 30], ['Parrot', 'Wild', 20]]
df = pd.DataFrame(data, columns=['Animal', 'Type', 'Max Speed'])
I understand how missing data are handled when grouping by a single column (see "groupby columns with NaN (missing) values"), but I do not understand the behaviour when grouping by two columns.
It seems I cannot loop over all groups, even though they all seem to be identified:
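For reference, this is a minimal sketch of the single-column behaviour I mean, assuming the default groupby settings. The names by_type, iterated_keys and sizes are only illustrative. The row whose Type is missing is not assigned to any group:

```python
import pandas as pd

data = [['Falcon', 'Captive', 390], ['Falcon', None, 350],
        ['Parrot', 'Captive', 30], ['Parrot', 'Wild', 20]]
df = pd.DataFrame(data, columns=['Animal', 'Type', 'Max Speed'])

# Group by the single column that contains the missing value.
by_type = df.groupby('Type')

# Collect the keys that are actually yielded when iterating.
iterated_keys = [key for key, _ in by_type]

# Number of rows per group; the row with a missing Type is not counted.
sizes = by_type.size()

print(iterated_keys)
print(int(sizes.sum()), "of", len(df), "rows are assigned to a group")
```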
groupeddf = df.groupby(['Animal', 'Type'])
counter = 0
for group in groupeddf:
    counter = counter + 1
print(counter)
len(groupeddf.groups)
The loop counts 3 groups, while len(groupeddf.groups) returns 4, which is inconsistent.
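To narrow this down, here is a small sketch that lists the keys the loop actually visits. The names grouped and visited_keys are illustrative, and I am assuming the default groupby settings. As far as I can tell, the ('Falcon', None) combination is the one that never shows up during iteration:

```python
import pandas as pd

data = [['Falcon', 'Captive', 390], ['Falcon', None, 350],
        ['Parrot', 'Captive', 30], ['Parrot', 'Wild', 20]]
df = pd.DataFrame(data, columns=['Animal', 'Type', 'Max Speed'])

grouped = df.groupby(['Animal', 'Type'])

# Each iteration yields a (key, sub-frame) pair; keep only the keys.
visited_keys = [key for key, _ in grouped]

for key in visited_keys:
    print(key)
print(len(visited_keys), "groups visited")
```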
Pandas version 1.0.3