I have the following dataframe df, which has a set of observations, the categories each one can belong to, and whether it won or not. I have only shown a subset of the rows, so there are actually more categories. I want to create a new dataframe where I remove any category column (any column with the prefix cat_) that doesn't correspond to any of the observations (i.e. the whole category column is 0).
id cat_food cat_fitness cat_retail cat_grocery win
1 1 0 1 0 1
2 1 0 0 0 0
3 0 1 0 0 1
4 1 0 0 0 1
4 1 0 0 0 0
5 1 0 0 0 1
6 0 1 1 0 1
6 0 1 1 0 0
My expected dataframe would have the column cat_grocery removed, because none of the observations belong to that category:
id cat_food cat_fitness cat_retail win
1 1 0 1 1
2 1 0 0 0
3 0 1 0 1
4 1 0 0 1
4 1 0 0 0
5 1 0 0 1
6 0 1 1 1
6 0 1 1 0
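
For reference, here is a minimal sketch of the filtering described above. It assumes df is a pandas DataFrame (the post does not say which library is in use) and that the category columns hold only 0/1 values. The names cat_cols, empty_cats and result are just illustrative.

```python
import pandas as pd

# Rebuild the example data from the post (assumption: df is a pandas DataFrame).
df = pd.DataFrame({
    "id":          [1, 2, 3, 4, 4, 5, 6, 6],
    "cat_food":    [1, 1, 0, 1, 1, 1, 0, 0],
    "cat_fitness": [0, 0, 1, 0, 0, 0, 1, 1],
    "cat_retail":  [1, 0, 0, 0, 0, 0, 1, 1],
    "cat_grocery": [0, 0, 0, 0, 0, 0, 0, 0],
    "win":         [1, 0, 1, 1, 0, 1, 1, 0],
})

# Only columns with the cat_ prefix are candidates for removal.
cat_cols = [c for c in df.columns if c.startswith("cat_")]

# A category is "empty" when every value in its column is 0.
empty_cats = [c for c in cat_cols if (df[c] == 0).all()]

# Drop the empty categories; id and win are never touched.
result = df.drop(columns=empty_cats)
print(result)
```

Restricting the check to the cat_ columns means a win column that happened to be all 0 would still be kept.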