I have two dataframes: a main one that I work with and an auxiliary one that I want to pull information from.
df1 (main) contains a Reporter column with various name strings.
df2 (additional information) contains the reporter name and their location.
I want the location column added as a new column in df1.
I can do this as a one-off with:
df1 = pd.merge(df1, df2, on='Reporter', how='left')
and it works.
My problem is that I run a frequently updating script (checking for new rows and for updates to old rows), and running this line of code repeatedly adds another copy of the location column on each execution.
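To illustrate the symptom, here is a small reproduction with made-up data. The reporter names and the column name Location are assumptions for the example only; the real column may be named differently.

```python
import pandas as pd

# Hypothetical sample data; "Location" is an assumed column name.
df1 = pd.DataFrame({"Reporter": ["Alice", "Bob"]})
df2 = pd.DataFrame({"Reporter": ["Alice", "Bob"],
                    "Location": ["Paris", "Oslo"]})

# First run: Location is added once, as intended.
df1 = pd.merge(df1, df2, on="Reporter", how="left")

# Second run: Location now exists on both sides, so pandas keeps both
# copies and disambiguates them with its default _x / _y suffixes.
df1 = pd.merge(df1, df2, on="Reporter", how="left")

print(list(df1.columns))
```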
Just checking whether the column already exists is not enough, because a new row (containing a new reporter name) may have been added to the main dataframe, and I DO want its location to be looked up and filled in.
Am I going about this the right way? Or should I do some sort of dict lookup and conditionally map the location each time? How can I do that in pandas?
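A minimal sketch of what such a lookup could look like, assuming the location column is named Location and that each reporter should map to a single location. The helper name add_location and the sample data are hypothetical. It uses Series.map with a Reporter-indexed Series, so the column is overwritten rather than duplicated on every run.

```python
import pandas as pd

def add_location(main, lookup):
    """Return a copy of `main` with Location filled in from `lookup`."""
    # Build a Reporter -> Location lookup. Duplicate reporters are dropped
    # (first occurrence wins) because map() needs a unique index.
    mapping = (lookup.drop_duplicates(subset="Reporter")
                     .set_index("Reporter")["Location"])
    out = main.copy()
    # Assigning to the same column name overwrites it on repeated runs.
    out["Location"] = out["Reporter"].map(mapping)
    return out

# Hypothetical sample data; Carol has no entry in df2.
df1 = pd.DataFrame({"Reporter": ["Alice", "Bob", "Carol"]})
df2 = pd.DataFrame({"Reporter": ["Alice", "Bob"],
                    "Location": ["Paris", "Oslo"]})

df1 = add_location(df1, df2)
df1 = add_location(df1, df2)  # running it again adds no extra columns

# Simulate the script picking up a new row, then refresh the locations.
new_rows = pd.DataFrame({"Reporter": ["Bob"]})
df1 = pd.concat([df1, new_rows], ignore_index=True)
df1 = add_location(df1, df2)

print(df1)
```

Reporters that are missing from df2 end up with NaN in Location, which matches what the left merge would give.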