This follows on from: Pandas - creating 2 new columns based on 2 columns and a separate test column
But it's a different question in its own right. It should be simpler!
In the referenced question, the following one-liner is discussed for filling 2 new columns from 2 other columns, depending on the value of a third column:
df['Buyer ID'], df['Seller ID'] = zip(
    *np.where(df.buy_sell == 'Buy',
              (df.buyer_name, df.seller_name),
              (df.seller_name, df.buyer_name)).T)
This works well - but when I try to simplify this to use fixed scalar values rather than corresponding values in other columns, it doesn't work.
For example, if I only have one possible buyer, John, and one possible seller, Maggie, then the following simpler construct should suffice:
df['Buyer ID'], df['Seller ID'] = zip(
    *np.where(df.buy_sell == 'Buy',
              ("John", "Maggie"),
              ("Maggie", "John")).T)
This is failing on the inner np.where() call with:
operands could not be broadcast together with shapes
I've tried a few variations, like wrapping the tuples in zip(), which changes the shape, but I still get the error. I think the problem is that ("John","Maggie") is not returned as the contents of a single column. Is the tuple being expanded to mean more than one column?
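To make the shape problem concrete, here is a minimal reproduction. It is only a sketch: the three-row frame and its name values are made up for illustration, and only the column names come from the code above. With Series in the tuples, the choices form a (2, n) array that lines up with the length-n condition; with scalar strings, the choices have shape (2,), which cannot broadcast against a length-3 condition.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "buy_sell": ["Buy", "Sell", "Buy"],
    "buyer_name": ["a", "b", "c"],
    "seller_name": ["x", "y", "z"],
})

cond = (df.buy_sell == "Buy").to_numpy()
cond_shape = cond.shape                                             # one entry per row
series_shape = np.asarray((df.buyer_name, df.seller_name)).shape    # 2 x rows
scalar_shape = np.asarray(("John", "Maggie")).shape                 # just 2
print(cond_shape, series_shape, scalar_shape)

# The scalar version fails because (3,) and (2,) cannot be broadcast together.
try:
    np.where(cond, ("John", "Maggie"), ("Maggie", "John"))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

So the tuple does seem to be read as a 2-element array along the same axis as the rows, rather than as one value per new column.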
This link also showed some promise: Changing certain values in multiple columns of a pandas DataFrame at once
But I think the solution assumes the columns you wish to edit already exist and that you only want the same single value placed in every column.
I can get around the problem by making several passes, but it's not ideal:
np.where(df.buy_sell == 'Buy', 'John', 'Maggie')
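Spelled out for both columns, the multi-pass workaround looks roughly like the sketch below. The three-row frame and the `defaults` mapping are hypothetical, introduced only to show the pattern; each new column costs one extra np.where() pass over the condition.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; only the buy_sell column comes from the question.
df = pd.DataFrame({"buy_sell": ["Buy", "Sell", "Buy"]})
is_buy = df.buy_sell == "Buy"

# Hypothetical mapping: new column -> (value if 'Buy', value otherwise).
defaults = {
    "Buyer ID": ("John", "Maggie"),
    "Seller ID": ("Maggie", "John"),
}

# One np.where() pass per new column.
for column, (if_buy, otherwise) in defaults.items():
    df[column] = np.where(is_buy, if_buy, otherwise)

print(df)
```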
Ideally, I want a single-pass solution, extensible to N new columns, where each new column is filled with its own fixed default value and all of them depend on a single (boolean) value in another column of the same row.
Any pointers on what I'm missing?