I have the following df:
id step1 step2 step3 step4 .... stepn-1, stepn, event
1 a b c null null null 1
2 b d f null null null 0
3 a d g h l m 1
Where the id is a session, the steps represent a certain path, and event is whether something specific happened
I want to create a feature store where we take all the possible steps (a, b, c, ... all the way to some arbitrary number) and make them the columns. Then I want the x-column to remain the id
and it just fill a 1 or zero if that session hit that step in the column. The result is below:
id a b c d e f g ... n event
1 1 1 1 0 0 0 0 0 1
2 0 1 0 0 0 1 0 0 0
3 1 0 0 1 0 0 1 1 1
I have a unique list of all the possible steps which I assume will be used to construct the new table. But after that I am struggling thinking how to create this.