How to pivot, combine in pandas

Question

I have a 4x3 dataframe and want to pivot it, combining the values wherever the same row/column pair occurs more than once.

Column A	Column B	Column C
boo	ptype	123
boo	tecnh	34e
boo	ptype	34w
boo	staaa	45r

I tried the following (adapted from a Stack Overflow answer), but it neither pivots nor combines the values:

combined = line.apply(lambda row: ','.join(row.values.astype(str)), axis=1)

Is there a way to pivot and combine to get the results as below?

Column A	ptype	tecnh	staaa
boo	123,34w	34e	45r

sayan dasgupta · Answer 1 · 2022-11-07T11:01:54.727

Use pivot_table with an aggregation function that joins the duplicate values:

# Column names are written with underscores here; the question's frame uses 'Column A' etc., so adjust them to your data.
res = df.pivot_table(index='Column_A', columns='Column_B',
                     values='Column_C', aggfunc=lambda x: ','.join(x))
# The column index carries the name 'Column_B'; assigning its plain values drops that label.
res.columns = res.columns.values
res = res.reset_index()
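For reference, here is a minimal self-contained sketch of the same approach run against the sample data from the question. It assumes the column names contain spaces, as in the question's table, and that `Column C` holds strings; the `astype(str)` call is an added guard in case some values were parsed as numbers.

```python
import pandas as pd

# Sample data copied from the question; 'Column C' is assumed to hold strings.
df = pd.DataFrame({
    "Column A": ["boo", "boo", "boo", "boo"],
    "Column B": ["ptype", "tecnh", "ptype", "staaa"],
    "Column C": ["123", "34e", "34w", "45r"],
})

# Join every value that lands in the same (Column A, Column B) cell.
res = df.pivot_table(
    index="Column A",
    columns="Column B",
    values="Column C",
    aggfunc=lambda x: ",".join(x.astype(str)),
)

# Drop the 'Column B' label from the column index and turn 'Column A' back into a column.
res.columns = res.columns.values
res = res.reset_index()

# pivot_table sorts the new columns alphabetically; reorder to match the question's layout.
res = res[["Column A", "ptype", "tecnh", "staaa"]]
print(res)
```

Values are joined in their original row order within each cell, so the `ptype` cell for `boo` comes out as `123,34w`.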

edited Nov 07 '22 at 11:01

answered Nov 07 '22 at 10:10

How would OP get rid of the `Column_B` index level? – AKX Nov 07 '22 at 10:11
Your question is not clear. `Column_B` is just the name of the column index. If you want to get rid of it, just set `res.columns = res.columns.values`. You can also inspect `res.columns` to see this. – sayan dasgupta Nov 07 '22 at 10:16
The point is your output has a multi-level index; OP's desired output doesn't. – AKX Nov 07 '22 at 10:57
If you check, it is not a multi-level index. The column index just has a name attribute. Anyway, I updated the answer to reflect the exact output. – sayan dasgupta Nov 07 '22 at 11:03
Thanks @sayan dasgupta. Your answer has solved my problem. This is exactly what I want. However, I can do groupby and unstack. – Venkat Nov 07 '22 at 18:01
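
The index-level point discussed in the comments above can be checked directly. The sketch below assumes the question's sample data and column names (with spaces); it inspects the column index returned by `pivot_table` when `values` is a single column name, and uses `rename_axis` as one alternative way to clear the label.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column A": ["boo", "boo", "boo", "boo"],
    "Column B": ["ptype", "tecnh", "ptype", "staaa"],
    "Column C": ["123", "34e", "34w", "45r"],
})

res = df.pivot_table(
    index="Column A",
    columns="Column B",
    values="Column C",
    aggfunc=lambda x: ",".join(x),
)

# One level only: the 'Column B' text shown above the headers is the index name.
print(res.columns.nlevels)
print(res.columns.name)

# Clear that name and restore 'Column A' as an ordinary column.
flat = res.rename_axis(columns=None).reset_index()
print(flat)
```

Passing a list to `values` (for example `values=['Column C']`) would instead produce a two-level column index, which may be the source of the confusion.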

score 0 · Answer 2 · answered Nov 07 '22 at 10:30

I think you need to group by before applying the pivot function, because pivot raises an error when an index/column pair appears more than once. This solves your problem:

import pandas as pd

# your dataframe
df = pd.DataFrame({'Column A': ['boo', 'boo', 'boo'], 'Column B': ['ptype', 'tecnh', 'ptype'], 'Column C': ['123', '34e', '34w']})
# group by the columns that contain duplicated pairs and join the values into one comma-separated string
df = df.groupby(['Column A', 'Column B'])['Column C'].agg(','.join).reset_index()
# after the groupby every (Column A, Column B) pair is unique, so pivot works
# reset_index() without drop=True keeps 'Column A' in the result
df = df.pivot(index='Column A', columns='Column B', values='Column C').reset_index()

Groupby and unstack solved the problem. Thanks @koding_buse for your response. — Venkat, Nov 07 '22 at 17:58
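
The groupby-and-unstack route mentioned in the comment above is not shown in the thread. A minimal sketch of what it might look like, assuming the question's sample data and string values in `Column C`:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column A": ["boo", "boo", "boo", "boo"],
    "Column B": ["ptype", "tecnh", "ptype", "staaa"],
    "Column C": ["123", "34e", "34w", "45r"],
})

res = (
    df.groupby(["Column A", "Column B"])["Column C"]
      .agg(",".join)               # one joined string per (Column A, Column B) pair
      .unstack("Column B")         # move the 'Column B' level into the columns
      .rename_axis(columns=None)   # drop the 'Column B' label from the column index
      .reset_index()
)
print(res)
```

This gives the same table as the `pivot_table` answer; which one to use is largely a matter of preference.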
