I have data that looks like this:
person_id funnel_iteration stage_name timestamp
1 1 aql 1
1 1 tal 2
1 1 tql 3
1 1 aql 4
1 2 suspect 1
1 2 prospect 2
2 1 sal 1
2 1 qso 2
that I want to pivot to look like this:
person_id funnel_iteration 1 2 3 4 5 6 7
1 1 aql tal tql aql
1 2 suspect prospect
2 1 sal qso
the columns that contain the stages should extend horizontally to include the max number of stages for the person_id, funnel_iteration combo that contains the most rows
I have funnel data partitioned by person and funnel iteration and I'm trying to group them into patterns based on order of stages and then get a total count for each pattern occurence.
for example:
12,323, aql, tal, tql, aql
24,312, aql, tal, tql, sal, qso
23, 343 tal, tql, sal, qso
55,050 suspect, prospect, aql, tal, tql, sal, qso, reject
102,323 suspect, prospect, aql, prospect, tal, tql, sal, qso, reject
Right now, I'm struggling on how best to pivot the data with these constraints. I've tried using pandas pivot_table but haven't had luck and keep running into problems.