0

Let's say I have two dataframes in Pandas in Python that look something like this:

df_test_1 = pd.DataFrame(data=None, columns=['file', 'number'])
df_test_1.file = ['file_1', 'file_2', 'file_3', 'file_23', 'file_24', 'file_25']
df_test_1.number = [1, 2, 3, 23, 24, 25]

df_test_2 = pd.DataFrame(data=None, columns=['file', 'new_file', 'number'])
df_test_2.file = ['file_4', 'file_5', 'file_6', 'file_41', 'file_42', 'file_43']
df_test_2.new_file = ['file_2', 'file_3', 'file_1', '', '', '']
df_test_2.number = [4, 5, 6, 41, 42, 43]

What I want is to then change the 'number' column in 'df_test_2' to equal the 'number' in 'df_test_1' for the corresponding 'file' in 'df_test_1' seen in 'new_file' in 'df_test_2' if it exists. Simply put, if df_test_2.file = 'file_4' and df_test_2.new_file = 'file_2', a new column could show df_test_2.new_number = 2, based on the 'number' in the df_test_1.file = 'file_2' row. The final column should look like:

[2, 3, 1, , , ]

My idea was something like this:

df_test_2['new_number'] = df_test_1.loc[df_test_1.file == df_test_2.new_file, 'number']

But it does not work. Any help would be very appreciated, thanks!

Marcus K.
  • 301
  • 1
  • 3
  • 9
  • 2
    Use `df_test_2['new_number'] = df_test_2.new_file.map(df_test_1.set_index('file')['number'])` – jezrael May 19 '23 at 09:51

0 Answers0