3

I have a dataframe which looks like:

df = pd.DataFrame(
    {
        "id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4],
        "mod": ["a", "a", "b", "b"] * 4,
        "qid": [11, 12, 13, 14] * 4,
        "ans": ["Z","Y","X","W","V","U","T","S","R","Q","P","O","N","M","L", "K"],
    }
)
df

    id  mod qid ans
0   1   a   11  Z
1   1   a   12  Y
2   1   b   13  X
3   1   b   14  W
4   2   a   11  V
5   2   a   12  U
6   2   b   13  T
7   2   b   14  S
8   3   a   11  R
9   3   a   12  Q
10  3   b   13  P
11  3   b   14  O
12  4   a   11  N
13  4   a   12  M
14  4   b   13  L
15  4   b   14  K

Each value of qid fits within mod entirely. E.g., qid = 11 only occurs in mod = a.

I'd like to reshape the data into wide format, with mod and qid as column levels:

         a       b
    11  12  13  14
id              
1   Z   Y   X   W
2   V   U   T   S
3   R   Q   P   O
4   N   M   L   K

Is this possible in Pandas? I've tried pivot() with no luck.

Taren Sanders
  • 421
  • 2
  • 12
  • 1
    See question 7. If you want a `MultiIndex` on the columns you should use `pivot_table`: `df.pivot_table(index='id', columns=['mod', 'qid'], values='ans', aggfunc='first')`. Another option is to set index then unstack, since this is just a reshape and not really an aggregation: `df.set_index(['id', 'mod', 'qid'])['ans'].unstack([-2,-1])` – ALollz Feb 07 '20 at 05:23

1 Answers1

1

Use pandas.pivot_table

pd.pivot_table(df, index='id', columns=['mod', 'qid'], aggfunc='first')

Output

    ans
mod a       b
qid 11  12  13  14
id              
1   Z   Y   X   W
2   V   U   T   S
3   R   Q   P   O
4   N   M   L   K
Vishnudev Krishnadas
  • 10,679
  • 2
  • 23
  • 55