I have two dataframes, df1 and df2. shapes of df1 is (118093,2) and df2 is (690,5)
I am trying to merge the dataframes using:
df3 = df1.merge(df2, on='consId', how='left')
I have removed the duplicates in df2(as suggested here) and I know for a fact that consId in df1 is repeated(although the entire row is not duplicated).
The behaviour I am expecting is that the resulting df3 should have the same number of rows as df1 but with the columns of df2 appended to df1.
My dataframes are as follows:
df_consId[18:25]
Out[22]:
consId consumption
18 1155696024483 97.203390
19 1155696024483 80.527278
20 1155696024483 78.953125
21 1155699007557 137.419355
22 1155699007557 117.145161
23 1155699007557 101.500000
24 1155699007557 85.883333
df.head()
Out[23]:
consId rfrg-age rfrg-rating rfrg-type rfrg-capc
0 1155696024483 812yr 1.0 1door 150L
1 1155699007557 812yr 1.0 1door 150L
2 1155694005571 48yr 1.0 2door 250L
3 1155691016680 NaN NaN NaN NaN
4 1155697016945 13yr 5.0 1door 150L
And I do df3 = df_consId.merge(df, on= 'consId', how = 'left')
Although the merge is working as I intended, there are additional rows in df3 although there are no duplicates in df. Any reason why?