I'm looking to see if it is even possible to do using python. I have more than 100 subjects, several variables extracted on different days; I want to make the variable names (written in rows) into column names and move its values accordingly. In case it doesn't make sense, I'm attaching a screenshot of what I want to do using Python. Please let me know if it is even possible, and if so, how? I'm not necessarily looking for specific script, but just the general flow? Thanks!
Asked
Active
Viewed 179 times
-1
-
Hi hek999, welcome to SO! Here are some tips to get the best out of this site Firstly, it's best to search for solutions before you post a question. E.g. [this](https://stackoverflow.com/questions/37150248/pandas-keyerror-using-pivot) and [this](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe). Also, [please don't post pictures of code](https://meta.stackoverflow.com/a/285557/9576876) (or data). It's better to include a little of both as text. Lastly, something like this will do the trick `df.pivot_table(index='subjID', columns='var_names', values='var_values')`. – Chris Apr 03 '20 at 07:49
1 Answers
0
Try the following code:
df.groupby('subjID').apply(lambda grp: grp.pivot(
index='date', columns='var_names', values='var_values'))\
.rename_axis(columns=None).reset_index()
Initially I thought about just pivot with index set to subjID and date, but unfortunately pivot does not serve such a case.
I found that this is a bug and has beed reported on Pandas GitHub as issue 21425.
So to circumvent this limitation, I grouped by subjID and then called pivot with index='date' on each group.
To see the need for rename_axis and reset_index, run my code without these fuctions and the difference should be obvious.

Valdi_Bo
- 30,023
- 4
- 23
- 41
-
-
1
-
*pivot_table* can be used, e.g. with *aggfunc=np.sum*, but it is designed rather for cases where there are **multiple** rows for a single index value (so some aggregation is needed). But in this case there is a **sigle** value for each index, so *pivot* seems a more natural choice. – Valdi_Bo Apr 03 '20 at 08:56