
I have two DataFrames:

import pandas as pd

df1 = pd.DataFrame({'User': ['user1', 'user1', 'user2', 'user3'],
                    'Grade': ['XLM', 'YK', 'AAO', 'FRT']})
df2 = pd.DataFrame({'User': ['user1', 'user1', 'user1', 'user2', 'user2', 'user3'],
                    'SocMed': ['Instagram', 'FB', 'Twitter', 'Quora', 'Pinterest', 'Snapchat']})

I want to use pd.merge (or any other, more appropriate command) to get a third DataFrame that looks as follows:

merged = pd.DataFrame({'User': ['user1', 'user1', 'user2', 'user3'],
                    'Grade': ['XLM', 'YK', 'AAO', 'FRT'],
                    'SocMed': [['Instagram', 'FB', 'Twitter'], ['Instagram', 'FB', 'Twitter'], ['Quora', 'Pinterest'], ['Snapchat']]})

Note: These are only samples. My actual first DataFrame has 15 columns and ~1,000,000 rows (370 unique users), and my second one has 600 rows (~350 unique users). This means that after the merge, some users will have no match in the second DataFrame, so their SocMed entry will be an empty (null) list. I am also fine with getting an 'exploded' DataFrame like so:

 User Grade     SocMed
user1   XLM  Instagram
user1   XLM         FB
user1   XLM    Twitter
user1    YK  Instagram
user1    YK         FB
user1    YK    Twitter
user2   AAO      Quora
user2   AAO  Pinterest
user3   FRT   Snapchat
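
For reference, the exploded form above looks like what a plain left join on User would give. A minimal sketch, assuming the sample frames from the top of the post (the name `exploded` is just for illustration):

```python
import pandas as pd

# Sample frames from the question (with the comma in df1 fixed).
df1 = pd.DataFrame({'User': ['user1', 'user1', 'user2', 'user3'],
                    'Grade': ['XLM', 'YK', 'AAO', 'FRT']})
df2 = pd.DataFrame({'User': ['user1', 'user1', 'user1', 'user2', 'user2', 'user3'],
                    'SocMed': ['Instagram', 'FB', 'Twitter', 'Quora', 'Pinterest', 'Snapchat']})

# how='left' keeps every row of df1; each df1 row is repeated once per
# matching df2 row. Users with no match in df2 get NaN in SocMed.
exploded = df1.merge(df2, on='User', how='left')
print(exploded)
```

With the real data, users missing from the second DataFrame would show up once per row of the first DataFrame, with NaN in SocMed.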

I have read up on pd.merge and DataFrame.explode, but I do not know how to get started.
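
For the list-valued form, one possible starting point is to collapse the second DataFrame to one list per user first and only then merge. This is a sketch under the assumption that the sample frames are representative; `socmed_lists` is a hypothetical intermediate name, and the final step that turns missing matches into empty lists is optional.

```python
import pandas as pd

# Sample frames from the question (with the comma in df1 fixed).
df1 = pd.DataFrame({'User': ['user1', 'user1', 'user2', 'user3'],
                    'Grade': ['XLM', 'YK', 'AAO', 'FRT']})
df2 = pd.DataFrame({'User': ['user1', 'user1', 'user1', 'user2', 'user2', 'user3'],
                    'SocMed': ['Instagram', 'FB', 'Twitter', 'Quora', 'Pinterest', 'Snapchat']})

# One row per user, with all of that user's SocMed values gathered in a list.
socmed_lists = df2.groupby('User')['SocMed'].agg(list).reset_index()

# Left join so every row of df1 survives; unmatched users get NaN here.
merged = df1.merge(socmed_lists, on='User', how='left')

# Optional: replace the NaN of unmatched users with an empty list.
merged['SocMed'] = merged['SocMed'].apply(lambda v: v if isinstance(v, list) else [])
print(merged)
```

Aggregating before merging keeps the join one-to-many from the small frame's side, so the large frame is not multiplied in size; calling `merged.explode('SocMed')` afterwards would give the long form again.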

Nilima
