so I want to merge two dataframes that look like this:
df1
id | object | position | x | y | z |
---|---|---|---|---|---|
abc | J | 3 | 0 | 1 | 0 |
abc | C | 13 | 1 | 0 | 0 |
cde | C | 2 | 0 | 1 | 0 |
etc etc (df1 is much smaller than df2)
df2
id | object | position | bla | bla | |
---|---|---|---|---|---|
1 | abc | D | 1 | bla | bla |
2 | abc | S | 2 | bla | bla |
3 | abc | J | 3 | bla | bla |
4 | abc | J | 4 | bla | bla |
... | ... | ... | ... | ... | ... |
12 | abc | X | 12 | bla | bla |
13 | abc | C | 13 | bla | bla |
... | ... | ... | ... | ... | ... |
45 | cde | J | 1 | bla | bla |
46 | cde | C | 2 | bla | bla |
etc etc etc
what I want is a code that can merge x y z from df1 into df2, using id, object and position to correctly map x y z, so my output would look like this
id | object | position | bla | bla | x | y | z | |
---|---|---|---|---|---|---|---|---|
1 | abc | D | 1 | bla | bla | 0 | 0 | 0 |
2 | abc | S | 2 | bla | bla | 0 | 0 | 0 |
3 | abc | J | 3 | bla | bla | 0 | 1 | 0 |
4 | abc | J | 4 | bla | bla | 0 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | |
12 | abc | X | 12 | bla | bla | 0 | 0 | 0 |
13 | abc | C | 13 | bla | bla | 1 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | |
45 | cde | J | 1 | bla | bla | 0 | 0 | 0 |
46 | cde | C | 2 | bla | bla | 0 | 1 | 0 |
so the stuff in bold are lines that are found in df1 where x, y, and z can be copied.
no matter what I try x y z in the merged tables ends up being full of 0s.
tried:
merged_df = df2.merge(df1, left_on=['id', 'Column A', 'position'], right_on=['id', 'object', 'position'], how='left')
merged_df = merged_df.drop(columns=['object'])
merged_df = merged_df.rename(columns={'x_x': 'x', 'y_x': 'y', 'z_x': 'z'})
merged_df = merged_df.fillna(0)
Everything ends up being 0
also tried:
merged_df = df1.merge(df2, on=['id', 'object', 'position'], how='left')
test_set_df['index'] = merged_df['index']
max_index_value = df2.index.max()
for index_value in range(1, max_index_value + 1):
if index_value not in df1['index'].values:
new_row = pd.DataFrame({'id': [''], 'object': [''], 'position': [''], 'x': [0], 'y': [0], 'z': [0], 'index': [index_value]})
df1 = pd.concat([df1, new_row], ignore_index=True)
df1.sort_values(by='index', inplace=True)
df1.reset_index(drop=True, inplace=True)
df2.rename(columns={'x': 'x_alpha', 'y': 'y_alpha', 'z': 'z_alpha'}, inplace=True)
df2_annotated = df2
df2_annotated = df2_annotated.merge(test_set_df[['index', 'x', 'y', 'z']], on='index', how='left')