
I have two large DataFrames. The first (df1) looks like this:

import pandas as pd

data1 = [['cluster1', 351740.00, 0.31], 
        ['cluster3', 401740.00, 0.43], 
        ['cluster1', 511830.00, 0.52], 
        ['cluster2', 601240.00, 0.75], 
        ['cluster2', 167343.00, 0.29], 
        ['cluster3', 509872.00, 0.51]]

df1 = pd.DataFrame(data1, columns= ['ClusterName', 'Column1', 'Column2'])
df1

and the second (df2) looks like this:

data2 = [['cluster1', '90% cereal', 0.31, 0.22], 
        ['cluster2', '88% livestock', 0.43, 0.26], 
        ['cluster3', '70% dairy', 0.52, 0.16]]

df2 = pd.DataFrame(data2, columns= ['ClusterNo', 'type', 'other1', 'other2'])
df2

I want to add a new column, remark, to df1. For each row, it should hold the value of df2['type'] from the df2 row whose df2['ClusterNo'] matches that row's df1['ClusterName'].

The resulting df1 should look like this:

  ClusterName   Column1  Column2         remark
0    cluster1  351740.0     0.31     90% cereal
1    cluster3  401740.0     0.43      70% dairy
2    cluster1  511830.0     0.52     90% cereal
3    cluster2  601240.0     0.75  88% livestock
4    cluster2  167343.0     0.29  88% livestock
5    cluster3  509872.0     0.51      70% dairy
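
For reference, here is a minimal sketch of one way the lookup could be done. It assumes each ClusterNo appears only once in df2, as in the sample data; the name type_by_cluster is just an illustrative helper, not something from my real code. It turns df2 into a Series keyed by ClusterNo and maps df1['ClusterName'] through it:

```python
import pandas as pd

data1 = [['cluster1', 351740.00, 0.31],
         ['cluster3', 401740.00, 0.43],
         ['cluster1', 511830.00, 0.52],
         ['cluster2', 601240.00, 0.75],
         ['cluster2', 167343.00, 0.29],
         ['cluster3', 509872.00, 0.51]]
df1 = pd.DataFrame(data1, columns=['ClusterName', 'Column1', 'Column2'])

data2 = [['cluster1', '90% cereal', 0.31, 0.22],
         ['cluster2', '88% livestock', 0.43, 0.26],
         ['cluster3', '70% dairy', 0.52, 0.16]]
df2 = pd.DataFrame(data2, columns=['ClusterNo', 'type', 'other1', 'other2'])

# Lookup Series: index is ClusterNo, values are type.
# Assumption: ClusterNo is unique in df2; duplicate keys would make map() raise.
type_by_cluster = df2.set_index('ClusterNo')['type']

# For each ClusterName in df1, fetch the matching type.
# Cluster names missing from df2 would end up as NaN.
df1['remark'] = df1['ClusterName'].map(type_by_cluster)

print(df1)
```

A left merge, such as df1.merge(df2[['ClusterNo', 'type']], left_on='ClusterName', right_on='ClusterNo', how='left'), would probably work too, but it returns a new DataFrame with an extra ClusterNo column that would then need dropping and renaming.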
