Which is the Pythonic way for Pandas to obtain the same result as an SQL statement: "UPDATE- LEFT JOIN - SET - WHERE"?

Question

I would like to replicate in Pandas the following SQL structure "Update dataframe1 LEFT JON dataframe2 SET dataframe1.column1 = dataframe2.column2 WHERE dataframe1.column3 > X"

I know it is possible to merge the dataframe and then work on the merged columns with with ".where"

However, it doesn't seems to be straighforward as a solution.

df = pd.merge(df1,df2, suffix(a,b))
df['clmn1'] = df['clmn1_b'].where( df[clmn1]>0, df['clmn1_b'])

Is there a better way to reach the goal? Thanks

Are you joining on a particular index, or on several overlapping columns? — Andy Hayden, May 14 '19 at 22:25
i think the same as WeNYoBen. Both `map` and `replace` would work — Andy L., May 14 '19 at 22:38
@AndyHayden I refer to a generic approach to adopt, some times there are indexes, in other cases there are not indexes. For the general case let's assume you have a column with client name and country, "client list": [Name, Gender, country] and list of field: [[Jack, M, USA], [Nick, M, UK], [Alphio, F, RU], [Jenny, F, USA]] then you have another table with the "abbreviation": [USA: United states, UK, United Kingdom, etc..] .. The use case I have in mind is to update table "client list" replacing the abbreviation with the long country's name, when the gender is Male (M). — m.falconelli, May 15 '19 at 09:41
I found this answer that define a better way (I believe) to replace the ".where" method. [link](https://stackoverflow.com/a/36910033/5806136) using .loc; however apparently is still requiring a first merging of the two DFs and then a work on the new columns — m.falconelli, May 15 '19 at 09:56

score 1 · Accepted Answer · answered May 15 '19 at 16:57

To use your example from the comments:

In [21]: df
Out[21]:
     Name Gender country
0    Jack      M     USA
1    Nick      M      UK
2  Alphio      F      RU
3   Jenny      F     USA

In [22]: country_map =  {'USA': 'United States', 'UK': 'United Kingdom', 'RU': 'Russia'}

In [23]: df.country.map(country_map)
Out[23]:
0     United States
1    United Kingdom
2            Russia
3     United States
Name: country, dtype: object

To update just the M rows you could use loc and update:

In [24]: df.country.update(df[df.Gender == 'M'].country.map(country_map))

In [25]: df
Out[25]:
     Name Gender         country
0    Jack      M   United States
1    Nick      M  United Kingdom
2  Alphio      F              RU
3   Jenny      F             USA

Which is the Pythonic way for Pandas to obtain the same result as an SQL statement: "UPDATE- LEFT JOIN - SET - WHERE"?

1 Answers1