I currently have a piece of code which uses np.where to check if each row in a dataframe matches certain conditions, then sets a new column value.
df['new_column'] = np.where((df['col_1'].str.contains('a') & df['col_2'].str.contains('b')) | (df['col_1'].str.contains('c') & df['col_2'].str.contains('d')), "Yes", "No")
As you can see, this isn't very clear or maintainable, especially since the definition of Yes and No for new_column changes. I was thinking about using a dictionary to define what counts as Yes and No and then using map to build the new column, but I'm not sure how to do this because Yes and No depend on values in multiple columns.
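To make the idea more concrete, here is a rough sketch of what I am picturing: the conditions move into a list of rules, and the np.where call stays the same no matter how many rules there are. The sample data and the names yes_rules and mask are made up for illustration; my real dataframe is different.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, only here so the sketch runs on its own.
df = pd.DataFrame({
    "col_1": ["apple", "cherry", "apple", "kiwi"],
    "col_2": ["bob", "dad", "dad", "bob"],
})

# Each rule is (substring to find in col_1, substring to find in col_2).
# A row is "Yes" if it satisfies at least one rule.
yes_rules = [("a", "b"), ("c", "d")]

# Start with all False and OR in the result of each rule.
mask = np.zeros(len(df), dtype=bool)
for pat_1, pat_2 in yes_rules:
    rule_hit = df["col_1"].str.contains(pat_1) & df["col_2"].str.contains(pat_2)
    mask |= rule_hit.to_numpy()

df["new_column"] = np.where(mask, "Yes", "No")
print(df)
```

With this layout, changing the definition of Yes only means editing yes_rules, but I don't know whether this is the idiomatic way to do it.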
Any suggestions here for improving this existing code?
EDIT: @JANO - this is a slight improvement, I think, as it's a little more structured and maintainable. Let me give an example of what I am trying to do. Say we have the dataframe below and want to create a new column, 'high paying job'.
       field       profession
0    Medical          Surgeon
1    Medical  Home Health Aid
2     Sports            Scout
3     Sports          Athlete
4  Education          Teacher
5  Education        Principal
If we make some assumptions, then I would want the new column to have the following values:
       field       profession  high paying job
0    Medical          Surgeon              Yes
1    Medical  Home Health Aid               No
2     Sports            Scout               No
3     Sports          Athlete              Yes
4  Education          Teacher               No
5  Education        Principal              Yes
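For this example, a minimal sketch of the dictionary idea might look like the following. The high_paying mapping is my own assumption about which professions count as high paying, and it uses exact matches on the profession rather than str.contains, so it is not a drop-in replacement for the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "field": ["Medical", "Medical", "Sports", "Sports", "Education", "Education"],
    "profession": ["Surgeon", "Home Health Aid", "Scout", "Athlete", "Teacher", "Principal"],
})

# Assumed lookup: for each field, the set of professions treated as high paying.
high_paying = {
    "Medical": {"Surgeon"},
    "Sports": {"Athlete"},
    "Education": {"Principal"},
}

# A row is high paying if its profession is listed under its field.
# Fields missing from the dictionary fall back to an empty set, so they give "No".
is_high = [
    profession in high_paying.get(field, set())
    for field, profession in zip(df["field"], df["profession"])
]

df["high paying job"] = np.where(is_high, "Yes", "No")
print(df)
```

The definition of Yes then lives in one place (high_paying), which is the part I expect to change.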