I have a multilabel classification problem.
I would like to delete the rows that have a value of 0 in all 35 columns of the dataframe, not counting the ['Doc'] column.
Example of the dataframe:
Doc Big Small Int Bor Drama
j2 0 0 0 0 0
i9 1 0 1 1 0
ui8 0 0 0 1 0
po4 0 1 0 0 0
po9 0 0 0 0 0
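To make this reproducible, the sample above can be built roughly like this. It only covers the five label columns shown here (my real dataframe has 35), and the variable name `df` matches the one I use further down:

```python
import pandas as pd

# Sample data copied from the table above; the real dataframe has many more label columns.
df = pd.DataFrame(
    {
        "Doc": ["j2", "i9", "ui8", "po4", "po9"],
        "Big": [0, 1, 0, 0, 0],
        "Small": [0, 0, 0, 1, 0],
        "Int": [0, 1, 0, 0, 0],
        "Bor": [0, 1, 1, 0, 0],
        "Drama": [0, 0, 0, 0, 0],
    }
)
print(df)
```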
Here's the expected outcome
Doc Big Small Int Bor Drama
i9 1 0 1 1 0
ui8 0 0 0 1 0
po4 0 1 0 0 0
These are the rows I would like to delete:
j2 0 0 0 0 0
po9 0 0 0 0 0
Here's how I count them:
rowSums = df.iloc[:,2:].sum(axis=1)
no_labelled = (rowSums==0).sum(axis=0)
print("no.docs with no label =", no_labelled)
no.docs with no label = 60
How can I delete these 60 rows from the dataframe?
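My rough attempt so far, on the sample data only, is a boolean mask over the label columns. This is a sketch rather than something I have verified on the full dataframe: it assumes every column other than 'Doc' is a 0/1 label column, and the names `label_cols`, `has_label` and `df_labelled` are just ones I made up for illustration.

```python
import pandas as pd

# Sample data from the example table; the real dataframe has 35 columns.
df = pd.DataFrame(
    {
        "Doc": ["j2", "i9", "ui8", "po4", "po9"],
        "Big": [0, 1, 0, 0, 0],
        "Small": [0, 0, 0, 1, 0],
        "Int": [0, 1, 0, 0, 0],
        "Bor": [0, 1, 1, 0, 0],
        "Drama": [0, 0, 0, 0, 0],
    }
)

# Assumption: every column except 'Doc' holds a 0/1 label.
label_cols = df.columns.drop("Doc")

# True for rows that have at least one non-zero label.
has_label = (df[label_cols] != 0).any(axis=1)

# Keep only the labelled rows; rows that are 0 in every label column are dropped.
df_labelled = df[has_label].reset_index(drop=True)
print(df_labelled)
```

On the sample this keeps i9, ui8 and po4 and drops j2 and po9, but I am not sure whether this is the idiomatic way to do it.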
Thanks