I have a DataFrame with several categorical columns, and I want to combine them all into a single categorical column, preferably using pandas.
For example, if I have two columns, category1 (c1) and category2 (c2), each holding values from 0 to 2 (or missing), I want to combine them into a new column category (c) that holds one integer code per distinct combination of values (0 to 6 in the example below).
I would go from this:
d1  d2  c1  c2
1   1   NA  0
2   1   1   1
3   1   0   2
4   2   2   NA
5   1   NA  NA
6   2   2   2
7   2   0   NA
8   2   0   2
To this:
d1  d2  c
1   1   0
2   1   1
3   1   2
4   2   3
5   1   4
6   2   5
7   2   6
8   2   2
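To make the expected mapping concrete, here is a minimal sketch that reproduces the table above. It assumes that codes are assigned in order of first appearance of each (c1, c2) pair and that a missing value counts as a value of its own. The string key and the "_" separator are illustrative choices of mine, not necessarily the idiomatic way to do this, and I have not checked how it scales to many columns.

```python
import pandas as pd

nan = float("nan")  # stands in for the NA entries in the tables above

# Rebuild the example input from the question.
df = pd.DataFrame({
    "d1": [1, 2, 3, 4, 5, 6, 7, 8],
    "d2": [1, 1, 1, 2, 1, 2, 2, 2],
    "c1": [nan, 1, 0, 2, nan, 2, 0, 0],
    "c2": [0, 1, 2, nan, nan, 2, nan, 2],
})

# Build one string key per row; missing values become the text "nan",
# so they are treated like any other category (assumption).
key = df["c1"].astype(str) + "_" + df["c2"].astype(str)

# pd.factorize returns (codes, uniques); codes are numbered in order
# of first appearance because sorting is off by default.
codes, uniques = pd.factorize(key)
df["c"] = codes

result = df[["d1", "d2", "c"]]
print(result)
```

With the example data, the c column comes out as 0, 1, 2, 3, 4, 5, 6, 2, which matches the table I am after.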
I tried following this, but it did not work and raised an error: ValueError: cannot reindex from a duplicate axis.
Thanks in advance for any help.