I have the two dataframes that have one column in common: users & apps. Each user can have multiple apps and each app will have name and installation date.
After joining both dataframes I get:
user_id ... app_name ins_date
user1 ... app1 1/1/2020
user1 ... app2 1/7/2020
user2 ... NULL NULL
user3 ... app1 1/5/2020
user4 ... app1 1/3/2020
I want to avoid repeating users but without losing the app details. Is it possible to insert the app's df
inside the user's df
?
Something like:
user_id ... app
user1 ... {'app_name': ['app1', 'app2'], 'ins_date': ['1/1/2020', '1/7/2020']}
user2 ... NULL
user3 ... {'app_name': ['app1'], 'ins_date': ['1/5/2020']}
user4 ... {'app_name': ['app1'], 'ins_date': ['1/3/2020']}
I tried DataFrameGroupBy.apply()
but I can only make a list with one of the columns.
Is there another way?