
I'm trying to merge two DataFrames in pandas, "table A" and "table B". Table A has 200K rows and table B has 310K rows. After the merge, I want the result to keep table A's 200K rows. I tried left, right, and outer merges, but for some reason the row count doesn't stay at 200K.

These are the tables I have and am trying to merge:

Table A:

    ID     pass    date         Suitcase    Layover
    500    yes     2/22/2018    1           yes
    501    no      2/23/2018    3           yes
    502    yes     2/24/2018    5           yes
    504    yes     2/25/2018    2           no
    505    yes     2/26/2018    1           yes
    506    no      2/27/2018    2           no
    507    yes     2/28/2018    5           no
    560    yes     3/15/2019    2           yes

Table B:

    ID      time_travel    country
    500     4              USA
    504     3              MEXICO
    507     1              Canada
    621     2              Australia
    3345    3              South Africa
    7755    2              France
    3385    1              California
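
This is the merge I tried:

    import pandas as pd

    # table_A and table_B are the two DataFrames shown above.
    # The key column is named "ID" in both tables, so the merge key must match that spelling.
    merging2 = pd.merge(table_A, table_B, on="ID", how="left", indicator=True)
    merging2.head()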

The goal is to keep the result at 200K rows (one per row of table A) and to get a table that looks like this:

    id     pass    date         Suitcase    Layover    time_travel    country
    500    yes     2/22/2018    1           yes        4              USA
    501    no      2/23/2018    3           yes
    502    yes     2/24/2018    5           yes
    504    yes     2/25/2018    2           no         3              MEXICO
    505    yes     2/26/2018    1           yes
    506    no      2/27/2018    2           no
    507    yes     2/28/2018    5           no         1              Canada

cesco
  • https://stackoverflow.com/questions/53645882/pandas-merging-101 – BENY May 02 '19 at 01:56
  • I tried the example shown, but it returns fewer values, which is weird – cesco May 02 '19 at 02:29
  • It's normal when there are duplicate `ID` values in table B. To keep the number of records in table A with a `LEFT JOIN`, you can deduplicate table B on `ID` using `drop_duplicates` or `groupby()`. – jxc May 02 '19 at 19:49
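
The deduplication suggested in the last comment could look roughly like the sketch below. The small `table_a` and `table_b` frames are hypothetical stand-ins for the real 200K and 310K row tables, and keeping the first row per `ID` is an assumption; which duplicate to keep depends on the actual data.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables; table_b repeats ID 500.
table_a = pd.DataFrame({
    "ID": [500, 501, 504],
    "pass": ["yes", "no", "yes"],
})
table_b = pd.DataFrame({
    "ID": [500, 500, 504, 621],
    "time_travel": [4, 2, 3, 2],
    "country": ["USA", "USA", "MEXICO", "Australia"],
})

# A plain left merge emits one output row per matching pair of rows, so the
# duplicated ID 500 in table_b makes the result longer than table_a.
inflated = table_a.merge(table_b, on="ID", how="left")

# Keep only the first row per ID in table_b (assumption: any duplicate will do),
# then merge again. validate="many_to_one" makes pandas raise an error if the
# right-hand keys are still not unique.
table_b_unique = table_b.drop_duplicates(subset="ID", keep="first")
merged = table_a.merge(table_b_unique, on="ID", how="left", validate="many_to_one")

print(len(table_a), len(inflated), len(merged))
```

With these sample frames the three lengths are 3, 4 and 3: the left merge only keeps table A's row count once `ID` is unique in table B. Rows of table A with no match (here ID 501) get `NaN` in the table B columns.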

0 Answers