1

I have a dataframe with columns as first ,col1,col2,col3,row1 import pandas as pd

df = pd.DataFrame({'first': ['1', '2', '3', '4', '5', '6', '7', '8'], 'col1': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'no', 'no'], 'col2': ['yes', 'yes', 'no', 'no' ,'yes', 'yes' ,'no', 'no'], 'col3': ['yes', 'no', 'yes', 'no','yes', 'no','yes', 'no'], 'row1':['4','5','7','3','7','3','79','8']})

first col1 col2 col3 row1 1 yes yes yes 4 2 yes yes no 5 3 yes no yes 7 4 yes no no 3 5 no yes yes 7 6 no yes no 3 7 no no yes 79 8 no no no 8

i want to apply conditions as below and create a new column as new_one

df.loc[((df['col1']=='no') & (df['col2']=='no') & (df['col3'] == 'no')),'new_one'] = 'no'
df.loc[~((df['col1']=='no') & (df['col2']=='no') & (df['col3'] == 'no')),'new_one'] = 'yes`

Problem: number of columns in the dataframe is dynamic and it is not constant as 3 in the above data frame but it matches with a pattern , here it is 'col*'

i want to select columns based up on the pattern(here col*) and apply the condition ('== 'no'') to each column and '&' them, and create a new column as new_one

i implemented the below code but not working

col_list=[]
for name in df.columns:
    if(re.search("col",name))
        col_list.append(name)
count=0
cond=''
for col in col_list:
   if(count==0):
      cond+="(df[\'"+col+"\'] == 'no')"
   else:
      cond+="&(df[\'"+col+"\'] == 'no')"
   count=count+1

df.loc[cond,'new_one'] = 'no'
df.loc[~cond,'new_one'] = 'no'

problem: cond is a str , i dont know how to use that as condition as df.loc expects cond to be series.i tried eval as well but no luck.please let me know any solution to this?

first col1 col2 col3 row1 new_one
1  yes  yes  yes    4     yes
2  yes  yes   no    5     yes
3  yes   no  yes    7     yes
4  yes   no   no    3     yes
5   no  yes  yes    7     yes
6   no  yes   no    3     yes
7   no   no  yes   79     yes
8   no   no   no    8      no
sandeep
  • 123
  • 7

2 Answers2

1

Use all/any:

df.loc[df[['col1', 'col2', 'col3']].eq('no').all(axis=1), 'new_one'] = 'no'
df.loc[df[['col1', 'col2', 'col3']].ne('no').any(axis=1), 'new_one'] = 'yes'

Or, for both yes/no in a single command, numpy.where:

df['new_one'] = np.where(df[['col1', 'col2', 'col3']].eq('no').all(axis=1),
                         'no', 'yes')

Output:

  first col1 col2 col3 row1 new_one
0     1  yes  yes  yes    4     yes
1     2  yes  yes   no    5     yes
2     3  yes   no  yes    7     yes
3     4  yes   no   no    3     yes
4     5   no  yes  yes    7     yes
5     6   no  yes   no    3     yes
6     7   no   no  yes   79     yes
7     8   no   no   no    8      no

dynamic number of columns:

mask = df.filter(like='col').eq('no').all(axis=1)

df['new_one'] = np.where(mask, 'no', 'yes')
mozway
  • 194,879
  • 13
  • 39
  • 75
  • thanks for the answer , problem is number of columns matches with pattern col* and it is dynamic – sandeep Jan 26 '23 at 09:47
  • @sandeepunna then `df.filter(like='col').eq('no').all(axis=1)` – mozway Jan 26 '23 at 09:49
  • @mozway Despite the question problem, your answer gave me a good tutorial, and I learned something about Pandas DataFrame manipulation. This problem sounds to me like *logic* `XOR/XNOR` applications possibly over _multiple_ columns within DataFrame [Ref1](https://stackoverflow.com/questions/35043739/logical-operation-on-two-columns-of-a-dataframe), [ref2](https://stackoverflow.com/questions/69052548/xor-operator-in-pandas-query), [ref3](https://stackoverflow.com/a/70264768/10452700) but I'm not sure if it can be used for _dynamic number of columns_ scenario. – Mario Jan 26 '23 at 15:21
  • @Mario yes you can use `np.logical_and.reduce(df.filter(like='col').eq('no'), axis=1)` ;) – mozway Jan 26 '23 at 15:53
0

Using lambda here you counting the number of time 'no' occurrence.

df['new'] = df[['col1','col2','col3']].T.apply(lambda r: 'no' if list(r).count('no')==3 else 'yes')
R. Baraiya
  • 1,490
  • 1
  • 4
  • 17