Concatenate cross tab values with dataframe?

Question

I have a dataframe df of the following type:

ID   Result Other_val
1    A      y
2    B      x
2    A      x
3    C      abc

After using pd.crosstab(df.ID, df.Result), I get a crosstab like this:

Result  A  B  C
ID
1       1  0  0
2       1  1  0
3       0  0  1

I wish to now concatenate these values to the end of my original dataframe (after removing the already present Result column) to get something like this:

ID   A   B   C   Other_val
1    1   0   0   y
2    1   1   0   x
3    0   0   1   abc

However, I am stumped. I can't seem to use pd.concat() to form the above table because of the strange way the cross-tab table is indexed.

Any help?

`Other_val` could also be `x y abc`, or am I wrong? How do you decide about the order there? — Cleb, Dec 16 '17 at 00:54
Hope the link help https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe/47152692#47152692 — BENY, Dec 16 '17 at 01:50

louis_guitton · Accepted Answer · 2017-12-16T01:10:49.340

3

I think what you're looking for is pd.crosstab([df.ID, df.Other_val], df.Result), because you need to group ID and Other_val.

In [5]: pd.crosstab([df.ID, df.Other_val], df.Result)
Out[5]: 
Result        A  B  C
ID Other_val         
1  y          1  0  0
2  x          1  1  0
3  abc        0  0  1

edited Dec 16 '17 at 01:10

answered Dec 16 '17 at 01:04

louis_guitton

5,105
1
31
33

1

That's far cleaner than mine and should probably the accepted answer (+ 1); did not even think about it but just blindly merged the dataframes... ;) You could still add a `.reset_index()`, if you like... – Cleb Dec 16 '17 at 01:30
1

That's why I at least upvoted your answer and also recommended to accept yours (see comment below my answer). – Cleb Dec 16 '17 at 01:34
I just came across this! Much cleaner and efficient- thank you – mDe Dec 16 '17 at 19:57

Cleb · Answer 2 · 2017-12-16T01:14:38.183

You can do

df2 = pd.crosstab(df.ID, df.Result)

and then either

df_final = df.drop('Result', axis=1).drop_duplicates('ID').join(df2, on='ID')

which gives

   ID Other_val  A  B  C
0   1         y  1  0  0
1   2         x  1  1  0
3   3       abc  0  0  1

If you want to reorder the columns you obtain your desired outcome

df_final = df_final[['ID', 'A', 'B', 'C', 'Other_val']]

   ID  A  B  C Other_val
0   1  1  0  0         y
1   2  1  1  0         x
3   3  0  0  1       abc

You can also use concat like this

pd.concat([df.drop('Result', axis=1).drop_duplicates('ID').set_index('ID'), df2], axis=1)

   Other_val  A  B  C
ID                   
1          y  1  0  0
2          x  1  1  0
3        abc  0  0  1

@mDe: Glad it helped; you might want to look at laguittemh's answer though, which seems far cleaner (you can always unaccept my answer and accept his). I did not even think about this but just blindly merged the dataframes... :) — Cleb, Dec 16 '17 at 01:32

Concatenate cross tab values with dataframe?

2 Answers2