Checking rows value and updating corresponding value within a column

Question

I am having trouble changing the values in a column. My dataset has the following columns:

Index(['Date', 'Name', 'Surname', 'Verified Account', 'Col1', 'Col2', 'Col3',
       'Col4', 'Col5'],
      dtype='object')

The columns 'Verified Account', 'Col1', 'Col2', 'Col3', 'Col4' and 'Col5' contain the values 'True'/'False'. Right now 'Verified Account' is based only on the values in 'Col1'. I would like to update it so that it takes all the other columns into account, i.e.:

if any of 'Col1', 'Col2', 'Col3', 'Col4' or 'Col5' has the value 'True', then 'Verified Account' should be 'True'; otherwise 'Verified Account' should be 'False'.

I have tried with:

df['Verified Account'] = df.apply(lambda x: 1 if df['Col1']=='True' or df['Col2']== 'True' or df['Col3']=='True' or df['Col4']== 'True' or df['Col5']=='True' else 'False')

but I got the following error:

TypeError: Cannot perform 'rand_' with a dtyped [bool] array and scalar of type [bool]

How can I fix it?

score 1 · Answer 1 · answered Apr 19 '20 at 14:47

Let us do

df['Verified Account']=df[['Col1'...]].any(axis=1)

BENY

score 0 · Accepted Answer · answered Apr 19 '20 at 15:23

I see three points:

First, pay attention to the types. Here the data appear to be strings. Is that the intended type? Probably not, since you are manipulating booleans stored as strings. A first step may be to convert these columns to boolean. This discussion explains how. You can try the following for each of these columns:

df['column_name'].map({"True": True, "False": False})
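
As a minimal sketch of that conversion applied to several columns at once (the sample data and the flag_cols list are assumptions made up for illustration, not taken from the question's dataset):

```python
import pandas as pd

# Hypothetical sample data; the flags are stored as strings, as in the question.
df = pd.DataFrame({
    "Col1": ["True", "False", "False"],
    "Col2": ["False", "False", "True"],
})

flag_cols = ["Col1", "Col2"]  # assumed list of the columns to convert
mapping = {"True": True, "False": False}

# map() returns a new Series, so the result is assigned back to the column.
# Any value missing from the mapping would become NaN.
for col in flag_cols:
    df[col] = df[col].map(mapping)

print(df)
```

Note that map does not modify the column in place, which is why the result is assigned back.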

Second, you are using apply with a lambda function, but inside the lambda you refer to the whole df instead of the x argument. You need to replace every df inside the lambda with x. Here is some reading about apply and lambda.
Last, apply has an axis argument that controls how it iterates: over columns or over rows. By default it iterates over columns (axis=0). Here the operation clearly has to be performed on each row, so axis=1 is required.

The snippet becomes:

df['Verified Account'] = df.apply(lambda x: True if x['Col1']==True or x['Col2']==True or x['Col3']==True or x['Col4']==True or x['Col5']==True else False, axis=1)
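
To illustrate what the axis argument changes, here is a small sketch on a made-up two-column frame (the data are an assumption for illustration and the columns are already boolean):

```python
import pandas as pd

# Hypothetical data with real booleans.
df = pd.DataFrame({"Col1": [True, False], "Col2": [False, False]})

# axis=0 (the default): the function is called once per column,
# so x is a whole column and the result has one entry per column.
per_column = df.apply(lambda x: x.any())

# axis=1: the function is called once per row,
# so x["Col1"] is a single value and the result has one entry per row.
per_row = df.apply(lambda x: x["Col1"] or x["Col2"], axis=1)

print(per_column)
print(per_row)
```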

Further improvement:

The lambda can be simplified by returning the condition itself:

df['Verified Account'] = df.apply(lambda x: x['Col1'] == True or
                                            x['Col2'] == True or
                                            x['Col3'] == True or
                                            x['Col4'] == True or
                                            x['Col5'] == True,
                                  axis=1)

There is a more efficient way to do this: the any method is designed for exactly this purpose. Don't forget to specify the axis, here the rows (axis=1). You can try:

df['Verified Account'] = df[["Col1", "Col2", "Col3", "Col4", "Col5"]].any(axis=1)
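
A self-contained sketch of the any approach on made-up data follows (the rows and the Name values are assumptions for illustration). It also shows one possible way to handle columns that still hold the strings 'True'/'False': compare them to 'True' with eq first, so that the string 'False' is not treated as truthy.

```python
import pandas as pd

cols = ["Col1", "Col2", "Col3", "Col4", "Col5"]

# Case 1: hypothetical data where the flag columns are real booleans.
df_bool = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Col1": [True, False, False],
    "Col2": [False, False, True],
    "Col3": [False, False, False],
    "Col4": [False, False, False],
    "Col5": [False, False, False],
})
df_bool["Verified Account"] = df_bool[cols].any(axis=1)

# Case 2: the same flags stored as strings, as in the question.
df_str = df_bool[["Name"] + cols].copy()
df_str[cols] = df_str[cols].astype(str)  # True -> "True", False -> "False"

# eq("True") builds a boolean frame first; any(axis=1) then reduces each row.
df_str["Verified Account"] = df_str[cols].eq("True").any(axis=1)

print(df_bool["Verified Account"].tolist())  # [True, False, True]
print(df_str["Verified Account"].tolist())   # [True, False, True]
```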
