I have this csv file below (products rated by users) which into pandas dataframe:
--------------------------------
User_id | Product_id | Rating |
--------------------------------
1 | 00 | 3 |
1 | 02 | 5 |
2 | 01 | 1 |
2 | 00 | 2 |
2 | 02 | 2 |
I want to change the dataframe so that it has the same number of rows as the source table above, but only two columns:
- Column 1: needs to be a list of L length (L = total number of existing kinds of products), and where the n-th value (n = product_id) in the list is the rating given by the user in this row to the product. All all other values in the list need to be zeros
- column 2 should be a list of the same L length, where the n-ths values equal to ratings for n-ths products (n = product_id) for all product_ids rated by this user (in the entire table); all other (unrated) values that are not rated by the user need to be zeros
The desired result would be (consistent with the example above):
--------------------------------
User_id | col1 | col2 |
--------------------------------
1 | [3,0,0] | [3,0,5] |
1 | [0,0,5] | [3,0,5] |
2 | [0,1,0] | [2,1,2] |
2 | [2,0,0] | [2,1,2] |
2 | [0,0,2] | [2,1,2] |
I will greatly appreciate any help with this. Please do ask questions if i can make the question & explanation more clear.