I am using Python Pandas to analyze my data.

My data looks like this:

simplified data

The dataframe has one column holding a list of students and another holding a list of dictionaries of exam scores.

My idea is to make it like this and analyze it:

goal

First, I would like to know whether this approach is reasonable. The first two columns would contain many duplicated values, which looks inefficient, but I don't know of a better way.

Second, my code produces lots of NaN values, and I would like to know how to get a result like the one above. It does not seem that join, concat, or merge can do it, so I need help.

student = df['student'].apply(pd.Series).T
df = df.drop(columns=['student'])
df = pd.concat([df, student], axis=1)

wrong result

Also, when I try apply(pd.Series) as other answers suggest, I get: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

hyeppy
  • `out = df.explode(['Student', 'Score'], ignore_index=True) ; out = out.join(pd.json_normalize(out.pop('Score').tolist()))` – mozway May 25 '23 at 09:47
  • Also, please do not use images of data, provide [reproducible text](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)! – mozway May 25 '23 at 09:48
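
A minimal sketch of the explode-then-normalize approach suggested in the comment above. The data was only posted as an image, so the column names (`class`, `student`, `score`) and all values here are assumptions, and it presumes each row's two lists have the same length (multi-column `explode` needs pandas 1.3 or later).

```python
import pandas as pd

# Hypothetical data: the real frame was only shown as an image, so these
# column names and values are assumptions for illustration.
df = pd.DataFrame({
    "class": ["A", "B"],
    "student": [["Ann", "Bob"], ["Cid"]],
    "score": [
        [{"math": 90, "art": 70}, {"math": 60, "art": 80}],
        [{"math": 75, "art": 85}],
    ],
})

# Explode both list columns in lockstep: one row per (student, score) pair.
# ignore_index=True gives a fresh 0..n-1 index so the later join lines up.
out = df.explode(["student", "score"], ignore_index=True)

# Turn each score dictionary into its own set of columns.
scores = pd.json_normalize(out.pop("score").tolist())

# Both frames now share the same RangeIndex, so a plain index join works.
out = out.join(scores)
print(out)
```

The repeated `class` values in the long format are expected; they make per-student and per-class operations such as `out.groupby("class")["math"].mean()` straightforward. Without resetting the index after `explode`, the join would match on the duplicated original row labels and pair students with the wrong scores.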