Python: Maintain multiple columns with np.where

Question

Is it possible to maintain multiple columns at once using np.where? usually, one column is maintained with np.where, so my coding would look like this:

df['col1] = np.where(df[df.condition == 'yes'],'sth', '')
df['col2'] = np.where(df[df.condition == 'yes'], 50.00, 0.0)

But caused by the fact that I test for the same condition twice, I was wondering, if I can pass 2 columns and fill them both in one run.

I tried this:

df['col1','col2'] = np.where(df[df.condition == 'yes'],['sth',50.00], ['',0.0])

But it does not work. Is there a way to realize this?

Thanks :)

What do you mean by "maintain multiple columns at once"? Can you show a sample of the data you have and the result you are trying to achieve? — jdehesa, Mar 23 '18 at 10:15
Possible duplicate of [Numpy where function multiple conditions](https://stackoverflow.com/questions/16343752/numpy-where-function-multiple-conditions) — Rafael, Mar 23 '18 at 10:15
@jezrael, I'm struggling to understand this question. If you understand it, would mind editing the question? I can see this confusing many users, not just me! — jpp, Mar 24 '18 at 21:05

jezrael · Accepted Answer · 2018-03-25T13:15:30.810

I think need reshape boolean mask to (N x 1):

m = df.condition == 'yes'
df[['col1','col2']] = pd.DataFrame(np.where(m[:, None], ['sth',50.00], ['',0.0]))

Only disadvatage of solution is if different types of values in lists - numeric with strings - then numpy.where both output columns convert to strings.

Sample:

df = pd.DataFrame({'A':list('abcdef'),
                     'condition':['yes'] * 3 + ['no'] * 3})

print (df)
   A condition
0  a       yes
1  b       yes
2  c       yes
3  d        no
4  e        no
5  f        no

m = df.condition == 'yes'
df[['col1','col2']] = pd.DataFrame(np.where(m[:, None], ['sth',50.00], ['',0.0]))
print (df)
   A condition col1  col2
0  a       yes  sth  50.0
1  b       yes  sth  50.0
2  c       yes  sth  50.0
3  d        no        0.0
4  e        no        0.0
5  f        no        0.0

print (df.applymap(type))
               A      condition           col1           col2
0  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>
1  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>
2  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>
3  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>
4  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>
5  <class 'str'>  <class 'str'>  <class 'str'>  <class 'str'>

EDIT: I test it with NaNs values:

df = pd.DataFrame({'A':list('abcdefghi'),
                     'condition':['yes'] * 3 + ['no'] * 3 + [np.nan] * 3})

m = df.condition == 'yes'
df[['col1','col2']] = pd.DataFrame(np.where(m[:, None], ['sth',50.00], ['',0.0]))
print (df)
   A condition col1  col2
0  a       yes  sth  50.0
1  b       yes  sth  50.0
2  c       yes  sth  50.0
3  d        no        0.0
4  e        no        0.0
5  f        no        0.0
6  g       NaN        0.0
7  h       NaN        0.0
8  i       NaN        0.0

thank you, this almost works. Any idea, why it does not work for the last 6 rows? col1 and col2 both contain nan within these last rows, all other rows have valid content. — MaMo, Mar 25 '18 at 12:54
@MaMo - hmmm, it seems there is problem need another condition for exclude `NaNs`. — jezrael, Mar 25 '18 at 13:01
@MaMo - I add sample to answer with `NaN`s and it working nice. Can you test it? Or I dont understand what you need? — jezrael, Mar 25 '18 at 13:16
I get `NaN`s in col1 and col2 for last few rows, not in the condition column. I so not know the reason because there is nothing special about those rows or the condition column. — MaMo, Mar 26 '18 at 11:22
yes, they are, but problem solved. It has been my mistake, the df I used has been a subset of another df so the indexing has been the problem. Added `df.reset_index()` before executing the `np.where`-Coding and voilà :) Thanks again! — MaMo, Mar 26 '18 at 13:08

Python: Maintain multiple columns with np.where

1 Answers1

Linked