
This is something I have:

import pandas as pd

list1_ = [("1","a","a1"),("1","b","b1"),("1","c","c"),("2","a","a2")]
df1 = pd.DataFrame(list1_,columns = ["user","col1","col2"])
list2_ = [("1","b","b2"),("1","a","a2"),("2","a","a3"),("1","c","c2")]
df2 = pd.DataFrame(list2_,columns = ["user","col1","col3"])

What I am trying to do is, for each (user, col1) pair in df2, find the matching pair in df1 and add col3 to df1. Basically, make df1 have the columns (user, col1, col2, col3), with rows aligned on the same (user, col1) values. The end result should look like this:

list3_ = [("1","a","a1","a2"),("1","b","b1","b2"),("1","c","c","c2"), 
("2","a","a2","a3")]
df3 = pd.DataFrame(list3_,columns = ["user","col1","col2","col3"])

Please note: I read df1 from a CSV file, and I create df2 using list2_. Therefore, I have some data in the form of list2_ but not in the form of list1_. So I would like to use only df1, list2_, and/or df2.

1 Answer


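Use pd.merge (here through the equivalent DataFrame.merge method), which by default performs an inner join on the given key columns: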

df1.merge(df2, on = ['user','col1'])

   user col1 col2 col3
0    1    a   a1   a2
1    1    b   b1   b2
2    1    c    c   c2
3    2    a   a2   a3
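
For completeness, here is a self-contained sketch of the same merge built from the question's data. The `how="left"` and `validate="one_to_one"` arguments are not part of the answer above; they are an assumption about what you might want if some (user, col1) pairs in df1 have no match in df2, or if you want pandas to raise an error when a key pair is duplicated.

```python
import pandas as pd

# Data copied from the question.
list1_ = [("1", "a", "a1"), ("1", "b", "b1"), ("1", "c", "c"), ("2", "a", "a2")]
df1 = pd.DataFrame(list1_, columns=["user", "col1", "col2"])
list2_ = [("1", "b", "b2"), ("1", "a", "a2"), ("2", "a", "a3"), ("1", "c", "c2")]
df2 = pd.DataFrame(list2_, columns=["user", "col1", "col3"])

# Default inner join: keeps only (user, col1) pairs present in both frames.
df3 = df1.merge(df2, on=["user", "col1"])

# Assumed variant: a left join keeps every row of df1 and fills col3 with NaN
# where df2 has no matching pair; validate raises MergeError if a key pair
# appears more than once in either frame.
df3_left = df1.merge(df2, on=["user", "col1"], how="left", validate="one_to_one")

print(df3)
```

With the data in the question every pair has exactly one match, so both calls should give the same four rows.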
yatu
  • _Most_ questions involving `merge` (including this one) can now be hammered as duplicate of [Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101?noredirect=1&lq=1). Encouraging all users to do so, let's keep this tag clean :) – cs95 Dec 11 '18 at 01:44
  • Thank you @coldspeed for the comment, but I hadn't come across it earlier – Avantika Banerjee Dec 11 '18 at 11:49
  • No none of them are. And even though I would not agree on a solution you may give at some point I don't think I've ever downvoted you, as generally your solutions are both pretty fast and neat – yatu Dec 13 '18 at 15:04
  • @nixon - It is really bad... So somebody wants me to think you are the downvoter :( – jezrael Dec 13 '18 at 15:07
  • well @jezrael good enough you now know that those are not mine :) – yatu Dec 13 '18 at 15:09