How to insert a data frame inside a dataframe Pandas

Question

I have the two dataframes that have one column in common: users & apps. Each user can have multiple apps and each app will have name and installation date.

After joining both dataframes I get:

user_id ...  app_name  ins_date
user1   ...   app1      1/1/2020
user1   ...   app2      1/7/2020
user2   ...   NULL      NULL
user3   ...   app1      1/5/2020
user4   ...   app1      1/3/2020

I want to avoid repeating users but without losing the app details. Is it possible to insert the app's df inside the user's df?

Something like:

user_id ...  app
user1   ...   {'app_name': ['app1', 'app2'], 'ins_date': ['1/1/2020', '1/7/2020']}
user2   ...   NULL      
user3   ...   {'app_name': ['app1'], 'ins_date': ['1/5/2020']}
user4   ...   {'app_name': ['app1'], 'ins_date': ['1/3/2020']}

I tried DataFrameGroupBy.apply() but I can only make a list with one of the columns.

Is there another way?

Please include the raw data for each dataframe and the code used to generate your combined dataframe. Please [provide a reproducible copy of the DataFrame with `to_clipboard`](https://stackoverflow.com/questions/52413246/how-do-i-provide-a-reproducible-copy-of-my-existing-dataframe) — Trenton McKinney, Oct 04 '19 at 01:50

score 0 · Answer 1 · answered Oct 04 '19 at 02:13

You can try this

df.groupby('user_id').apply(lambda x: {'app_name': x['app_name'].tolist(), 'ins_date': x['ins_date'].tolist()})

output

user_id
user1    {'app_name': ['app1', 'app2'], 'ins_date': ['1/1/2020', '1/7/2020']}
user2    {'app_name': [nan], 'ins_date': [nan]}                              
user3    {'app_name': ['app1'], 'ins_date': ['1/5/2020']}                    
user4    {'app_name': ['app1'], 'ins_date': ['1/3/2020']}                    
dtype: object

Update: used pd.set_option('display.max_colwidth', -1) to display max col width

How to insert a data frame inside a dataframe Pandas

1 Answers1