I have data shaped as follows:
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3], 'item': ['item_a', 'item_a', 'item_b'],
                   'score': [1, -1, 1]})
id  item    score
1   item_a   1
2   item_a  -1
3   item_b   1
I want to get dummy codes for the item column, but I want them scored based on their values in the score column. If there are no observations, I want a 0 imputed. Like so:
id  item_a  item_b
1    1      0
2   -1      0
3    0      1
As you can see, I want to capture that user id 1 liked item_a, that user id 2 disliked item_a, and that user id 3 did not interact with item_a. The id column is not unique per row: for example, user id 3 could also have liked item_a, and that would be recorded as a new row in the original dataframe.
I've tried using get_dummies in pandas, but that method only produces 0/1 indicators for the values observed in the "item" column; it doesn't take the score values into account.
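To make the gap concrete, here is a minimal sketch of the get_dummies attempt, followed by one possible way the indicators might be scaled by the score column. It assumes one output row per input row is wanted (so repeated ids stay as separate rows); the variable names `dummies`, `scored`, and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'item': ['item_a', 'item_a', 'item_b'],
                   'score': [1, -1, 1]})

# The attempt: plain indicator columns, 1 where the item was observed.
# dtype=int keeps the columns numeric instead of boolean.
dummies = pd.get_dummies(df['item'], dtype=int)

# Possible fix (an assumption about the desired behavior):
# multiply each row of indicators by that row's score, so observed
# items carry the score and unobserved items stay 0.
scored = dummies.mul(df['score'], axis=0)

# Put the id column back in front of the scored indicators.
result = pd.concat([df[['id']], scored], axis=1)
print(result)
```

If a single row per id were wanted instead, the scored columns could presumably be aggregated afterwards (for example with a groupby on id), but how conflicting scores should combine is not specified here.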