I have a data frame with multiple columns, each holding a "Yes" or "No" response for a particular condition. Table_Example

I am trying to create a new column that counts the "Yes" responses in each row across these condition columns. New column: Table_with_new_Count_Column

I tried the solution from this other question, "Efficient way to count string values across multiple columns to create new total column", and variations of it from other posts, but I keep getting errors, most commonly this warning:

df2['Count'] = df2.iloc[:, 0:9].eq('Yes').sum(axis=1)

<ipython-input-21-189874c017d3>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df2['Count'] = df2.iloc[:, 0:9].eq('Yes').sum(axis=1)

I've looked everywhere and am probably missing something simple. Any help would be greatly appreciated.

Craicerjack
  • I don't think there is an issue in this code snippet; it is correct and gives the desired output. It would help if you could provide the data. – GodWin1100 Jun 30 '22 at 14:08

3 Answers

Basically, construct a new column (or columns) with numeric values representing yes and no: yes=1, no=0. You can then add those values to get a total or anything else you need.

This would work:


import pandas as pd

# create a dataframe example
d = {'col_1':['yes', 'yes', 'no', 'yes'],
    'col_2':['yes', 'yes', 'no', 'yes']
}

df = pd.DataFrame(d)

# map a 'yes'/'no' string to 1/0
def yes_or_no(x):
    if x == 'yes':
        return 1
    if x == 'no':
        return 0
    # fail loudly: returning a string here would break the later sum()
    raise ValueError(f'unexpected value: {x!r}')

# apply the function
df['col_3']=df['col_1'].apply(yes_or_no)

# print the result
y = df['col_3'].sum()
print('number of yes', y)

result:

number of yes 3
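The snippet above totals a single column. Since the question asks for a count per row across several columns, the same yes=1, no=0 idea could be extended as in the sketch below. It uses a dictionary with `Series.map` instead of the function above, and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, lowercase to match the answer above
df = pd.DataFrame({
    'col_1': ['yes', 'yes', 'no', 'yes'],
    'col_2': ['yes', 'no', 'no', 'yes'],
})

# Map each answer to 1 or 0, column by column;
# any value missing from the mapping becomes NaN
mapping = {'yes': 1, 'no': 0}
numeric = df[['col_1', 'col_2']].apply(lambda col: col.map(mapping))

# Sum across the columns (axis=1) to get one count per row
df['count'] = numeric.sum(axis=1)
print(df)
```

Note that the match is case-sensitive, so data containing 'Yes' would need 'Yes' as the mapping key.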
D.L

Try the code below; it is tested and working.

import pandas as pd

# create a dataframe example
d = {'c1': ['Yes', 'Yes', 'No', 'Yes'],
     'c2': ['Yes', 'Yes', 'No', 'Yes'],
     'c3': ['Yes', 'Yes', 'No', 'No'],
     'c4': ['Yes', 'Yes', 'Yes', 'Yes'],
     'c5': ['No', 'Yes', 'No', 'Yes']
     }

df = pd.DataFrame(d)

# compare every cell to 'Yes', then count the True values in each row
df['count'] = df.eq('Yes').sum(axis=1)

print(df)
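If the frame also holds columns that are not Yes/No conditions, it may be safer to name the columns to count rather than slicing all of them. A minimal sketch, assuming a hypothetical `id` column and a hand-written `condition_cols` list:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [101, 102, 103],        # hypothetical non-condition column
    'c1': ['Yes', 'No', 'Yes'],
    'c2': ['Yes', 'No', 'No'],
    'c3': ['No', 'No', 'Yes'],
})

# Count 'Yes' only in the listed condition columns
condition_cols = ['c1', 'c2', 'c3']
df['count'] = df[condition_cols].eq('Yes').sum(axis=1)
print(df)
```

Selecting by name also keeps the count stable if columns are later added or reordered.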
Smaurya

I got the same warning with the solutions provided so far. However, I found another post that speaks directly to the warning message I received. I tried its approach and it worked with my original code. Thank you all; I appreciate your guidance.

Other post for reference.

How to deal with SettingWithCopyWarning in Pandas
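The question does not show how `df2` was created, so the exact fix used is not known. One common cause of this warning is that `df2` is a filtered slice of another DataFrame, in which case taking an explicit copy usually resolves it. A minimal sketch under that assumption (the `group` column and the filter are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['a', 'a', 'b', 'b'],
    'c1': ['Yes', 'Yes', 'No', 'Yes'],
    'c2': ['Yes', 'No', 'No', 'Yes'],
})

# .copy() makes df2 an independent DataFrame rather than a slice of df,
# so adding a column to it does not raise SettingWithCopyWarning
df2 = df[df['group'] == 'a'].copy()
df2['Count'] = df2[['c1', 'c2']].eq('Yes').sum(axis=1)
print(df2)
```

The original `df` is left unchanged; only `df2` gains the `Count` column.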
