I'm using pandas to read an Excel file of survey results (one participant per row), but many variables end up split across multiple columns, like this:
>>> df.columns
Index([ ... , 'Age', 'Unnamed: 12', 'Unnamed: 13', 'Unnamed: 14', 'Unnamed: 15', 'Unnamed: 16', ...], dtype='object', length=256)
where each unnamed column after 'Age' and before the next named column contains only the values of the Age variable corresponding to a single choice from that multiple-choice question.
How do I get all the Age values under the same column?
Edit: example of the output of df.head(5).to_dict():
{...,
'Gender': {0: 'M', 1: 'M', 2: 'M', 3: nan, 4: nan},
'Unnamed: 10': {0: 'F', 1: nan, 2: nan, 3: 'F', 4: 'F'},
'Age': {0: 25.0, 1: nan, 2: 25.0, 3: nan, 4: nan},
'Unnamed: 12': {0: 26.0, 1: nan, 2: nan, 3: 26.0, 4: nan},
'Unnamed: 13': {0: 27.0, 1: nan, 2: nan, 3: nan, 4: nan},
'Unnamed: 14': {0: 28.0, 1: nan, 2: nan, 3: nan, 4: 28.0},
'Unnamed: 15': {0: 29.0, 1: nan, 2: nan, 3: nan, 4: nan},
'Unnamed: 16': {0: 30.0, 1: nan, 2: nan, 3: nan, 4: nan},
...}
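To make the sample above reproducible, here is a minimal sketch that rebuilds only the two groups shown (the elided columns are left out) and tries one possible way of collapsing them. It assumes every 'Unnamed: N' column belongs to the nearest named column on its left, and that taking the first non-null value per row is acceptable. The names `groups` and `merged` are made up for illustration.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Only the columns visible in the sample; the real frame has 256 columns.
df = pd.DataFrame({
    'Gender': {0: 'M', 1: 'M', 2: 'M', 3: nan, 4: nan},
    'Unnamed: 10': {0: 'F', 1: nan, 2: nan, 3: 'F', 4: 'F'},
    'Age': {0: 25.0, 1: nan, 2: 25.0, 3: nan, 4: nan},
    'Unnamed: 12': {0: 26.0, 1: nan, 2: nan, 3: 26.0, 4: nan},
    'Unnamed: 13': {0: 27.0, 1: nan, 2: nan, 3: nan, 4: nan},
    'Unnamed: 14': {0: 28.0, 1: nan, 2: nan, 3: nan, 4: 28.0},
    'Unnamed: 15': {0: 29.0, 1: nan, 2: nan, 3: nan, 4: nan},
    'Unnamed: 16': {0: 30.0, 1: nan, 2: nan, 3: nan, 4: nan},
})

# Assumption: an 'Unnamed: N' column continues the named column to its left,
# so blank out the unnamed labels and forward-fill the last real name.
groups = pd.Series(df.columns, index=df.columns)
groups = groups.where(~groups.str.startswith('Unnamed:')).ffill()

# For each group, back-fill across the columns and keep the first one,
# which then holds the first non-null value found in every row.
merged = pd.DataFrame({
    name: df[cols.index].bfill(axis=1).iloc[:, 0]
    for name, cols in groups.groupby(groups, sort=False)
})

print(merged)
```

Row 0 of the sample has a value in every column of each group, so it may be a row of choice labels rather than a participant; this sketch simply keeps the first value there, and that row would probably need to be dropped or handled separately.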