I have two dataframes. The first one, df_1, holds states in the following format:
session_id | user_id | timestamp | state1 | state2
The second one, df_2, holds states in this format:
session_id | user_id | timestamp | state3 | state4
I want to join them into a final dataframe with the following structure:
session_id | user_id | timestamp | state1 | state2 | state3 | state4
I want to keep the rows from both frames, not just the ones that overlap. Something like this:
   session_id  user_id  timestamp  state1  state2  state3  state4
0           1        1      14:00       1       1       0       1
1           1        2      14:00     NaN     NaN       0       1
2           1        3      14:00       1       0     NaN     NaN
Basically, I think it should be an outer join. So I studied the documentation and came up with this:
df_1.set_index(['session_id', 'user_id', 'timestamp']).join(
    df_2.set_index(['session_id', 'user_id', 'timestamp']), how="outer")
But it is not working the right way. It just puts the dataframes one under the other. What should I do?
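For reference, here is a minimal sketch of the result I am after, built from the sample rows above. The values are only the illustrative ones from the table, not my real data, and it uses `DataFrame.merge` on the key columns instead of `set_index(...).join(...)`. The `indicator=True` column is added only to show which frame each row came from.

```python
import pandas as pd

# Key columns shared by both frames.
keys = ["session_id", "user_id", "timestamp"]

# Sample rows taken from the desired output above (illustrative values only).
df_1 = pd.DataFrame({
    "session_id": [1, 1],
    "user_id": [1, 3],
    "timestamp": ["14:00", "14:00"],
    "state1": [1, 1],
    "state2": [1, 0],
})
df_2 = pd.DataFrame({
    "session_id": [1, 1],
    "user_id": [1, 2],
    "timestamp": ["14:00", "14:00"],
    "state3": [0, 0],
    "state4": [1, 1],
})

# Outer merge keeps every key combination found in either frame;
# columns missing on one side are filled with NaN.
merged = df_1.merge(df_2, on=keys, how="outer", indicator=True)
merged = merged.sort_values(keys).reset_index(drop=True)

print(merged)

# Rows only line up when the key values are exactly equal, so it is
# worth comparing the key dtypes of both frames as well.
print(df_1[keys].dtypes)
print(df_2[keys].dtypes)
```

With this toy data the row for user 1 is combined, while users 2 and 3 each keep their own row. One possible explanation for the stacked result on my real data, which I have not confirmed, is that the key values differ between the two frames (for example a timestamp stored as a string in one and as a datetime in the other, or timestamps that differ by seconds), so no rows are considered equal.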