Scenario:
If I have this table, let's call it df:
survey_answer_1___1 | survey_answer_1___2 | survey_answer_1___3 | survey_answer_2___1 | survey_answer_2___2 |
---|---|---|---|---|
1 | 1 | 0 | 1 | 0 |
0 | 1 | 0 | 0 | 0 |
0 | 0 | 0 | 1 | 0 |
1 | 1 | 1 | 0 | 0 |
Using R or Python, how do I split and transform df into survey_answer_1 and survey_answer_2 like this:
survey_answer_1:
1 | 2 | 3 |
---|---|---|
2 | 3 | 1 |
survey_answer_2:
1 | 2 |
---|---|
2 | 0 |
The column names of the new tables are the parts of the df column names that come after '___'. Each value in the new tables is the count of 1s in the corresponding column of df. This should be done automatically (the tables should not be "hard-coded"), as there are many other columns in my data file that this should be applied to as well.
split() can be used to extract the numbers after '___' from the column names. I tried implementing the rest using a dictionary, but it is not working.
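For reference, here is a minimal sketch of the dictionary approach in plain Python. It assumes the data is available as a mapping from column name to a list of 0/1 values (with pandas, something like df.to_dict("list") should produce that shape). The names columns and count_ones_by_question are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for df: column name -> list of 0/1 answers.
columns = {
    "survey_answer_1___1": [1, 0, 0, 1],
    "survey_answer_1___2": [1, 1, 0, 1],
    "survey_answer_1___3": [0, 0, 0, 1],
    "survey_answer_2___1": [1, 0, 1, 0],
    "survey_answer_2___2": [0, 0, 0, 0],
}


def count_ones_by_question(columns, sep="___"):
    """Group columns by the prefix before `sep` and count the 1s in each."""
    tables = defaultdict(dict)
    for name, values in columns.items():
        if sep not in name:
            continue  # skip columns that do not follow the naming pattern
        # Split on the last separator: prefix names the table, suffix the column.
        question, option = name.rsplit(sep, 1)
        tables[question][option] = sum(1 for v in values if v == 1)
    return dict(tables)


tables = count_ones_by_question(columns)
print(tables["survey_answer_1"])
print(tables["survey_answer_2"])
```

Nothing is hard-coded here: every column containing '___' is assigned to a table named after its prefix, so additional survey_answer_N columns would be picked up without changes. Columns without the separator are skipped, which is an assumption about how the other columns in the file should be treated.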