I have two large dataframes that look like this (df1):
import pandas as pd

data1 = [['cluster1', 351740.00, 0.31],
         ['cluster3', 401740.00, 0.43],
         ['cluster1', 511830.00, 0.52],
         ['cluster2', 601240.00, 0.75],
         ['cluster2', 167343.00, 0.29],
         ['cluster3', 509872.00, 0.51]]
df1 = pd.DataFrame(data1, columns=['ClusterName', 'Column1', 'Column2'])
df1
and (df2):
data2 = [['cluster1', '90% cereal', 0.31, 0.22],
         ['cluster2', '88% livestock', 0.43, 0.26],
         ['cluster3', '70% dairy', 0.52, 0.16]]
df2 = pd.DataFrame(data2, columns=['ClusterNo', 'type', 'other1', 'other2'])
df2
I want to update df1 by adding a new column, remark, that holds the value of df2['type'] wherever df1['ClusterName'] and df2['ClusterNo'] match. The resulting df1 should look like:
  ClusterName   Column1  Column2         remark
0    cluster1  351740.0     0.31     90% cereal
1    cluster3  401740.0     0.43      70% dairy
2    cluster1  511830.0     0.52     90% cereal
3    cluster2  601240.0     0.75  88% livestock
4    cluster2  167343.0     0.29  88% livestock
5    cluster3  509872.0     0.51      70% dairy
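
To make the intended lookup concrete, here is a minimal sketch of one way it might be done. It assumes each ClusterNo appears only once in df2, so that df2 can act as a lookup table. The name `lookup` is introduced purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['cluster1', 351740.00, 0.31],
     ['cluster3', 401740.00, 0.43],
     ['cluster1', 511830.00, 0.52],
     ['cluster2', 601240.00, 0.75],
     ['cluster2', 167343.00, 0.29],
     ['cluster3', 509872.00, 0.51]],
    columns=['ClusterName', 'Column1', 'Column2'])

df2 = pd.DataFrame(
    [['cluster1', '90% cereal', 0.31, 0.22],
     ['cluster2', '88% livestock', 0.43, 0.26],
     ['cluster3', '70% dairy', 0.52, 0.16]],
    columns=['ClusterNo', 'type', 'other1', 'other2'])

# Build a Series keyed by ClusterNo whose values are the 'type' strings.
# Assumption: ClusterNo is unique in df2 (one row per cluster).
lookup = df2.set_index('ClusterNo')['type']

# Look up each ClusterName in that Series; names with no match become NaN.
df1['remark'] = df1['ClusterName'].map(lookup)

print(df1)
```

A left merge, `df1.merge(df2[['ClusterNo', 'type']], left_on='ClusterName', right_on='ClusterNo', how='left')`, would be another option, though it returns a new dataframe that also carries the ClusterNo column, so the `map` version stays closer to "add one column to df1".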