I am trying this problem but not getting the right solution.
So, I have a data which has City and Months mapped to them
City | Month |
---|---|
A | M1 |
A | M2 |
B | M3 |
B | M4 |
C | M5 |
C | M8 |
I have created dummy variables and have marked them as binary in this manner
City | M1 | M2 | M3 | M4 | M5 | M8 |
---|---|---|---|---|---|---|
A | 1 | 0 | 0 | 0 | 0 | 0 |
A | 0 | 1 | 0 | 0 | 0 | 0 |
B | 0 | 0 | 1 | 0 | 0 | 0 |
B | 0 | 0 | 0 | 0 | 0 | 1 |
C | 0 | 0 | 0 | 1 | 0 | 0 |
C | 0 | 0 | 0 | 0 | 1 | 0 |
Now, the main problem is, I want to mark each location to a month in a single row, like this
City | M1 | M2 | M3 | M4 | M5 | M8 |
---|---|---|---|---|---|---|
A | 1 | 1 | 0 | 0 | 0 | 0 |
B | 0 | 0 | 1 | 0 | 0 | 1 |
C | 0 | 0 | 0 | 1 | 1 | 0 |
Can anyone suggest how to move from table 2 to table 3 structure? I do not want to hard code them as different locations might get assigned random months in subsequent data. Getting dummy variables in easy but how do I get to last format? any useful functions existing in python for this?