Let say I have a dataframe df1 with columns are like [user_id: string, app_name: string, frequency: float]. Now I want to get a df2 matrix representation Users x Apps, where each entry indicates the frequency. In other words, df2 would have columns that look like [user_id, frequency_app_1, frequency_app_2, ... frequency_app_n] Also, how could I handle the data when n<1000 and n>10000?
Since I don't know the keyword for this problem, a clue would also be helpful.
Let say dataset is like below
user_id app_name frequency
id001 app4 324.54
id005 app4 786.56
id002 app3 87.79
id002 app4 423.76
id003 app1 23.34
id001 app2 34.76
id001 app1 796.89
id004 app6 324.12
id002 app1 32.34
id004 app3 124.98
id003 app1 324.98
id003 app1 434.5
id001 app1 89.54
id001 app4 54.51
id002 app2 544.87
id003 app3 76.0
id002 app1 32.32
id001 app1 654.56
The expected outcome should look like
user_id frequency_app_1 frequency_app_2 frequency_app_3 frequency_app_4
id001 1629.1 34.76 N/A 379.05
...