I have a pandas DataFrame with 3 columns, and I need to combine 2 of them into a result column.

In simple terms: result_column = not(column1) or (column2)

I tried the following:

df['status'] = ~df['offline'] | df['online']

but this line raises the following error:

TypeError: bad operand type for unary ~: 'float'

I searched for a solution and found that '~' is used with the Series data structure, but I couldn't find an example of the same thing for a DataFrame. I appreciate your time.

Sudheer G
  • Check [ask] and [how to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – buran Jun 30 '21 at 04:29
  • Looks like `df['offline']` contains `float` values. – ssp Jun 30 '21 at 04:40

1 Answer

Something in your DataFrame is not the type of data you are expecting, though I'm not sure which values are causing the error. Null values sometimes cause errors like this one, and with some methods (for example `Series.str.contains`) you can pass `na=False` to handle them. In this case, though, I have no problem with floats (2.2) or nulls (np.nan) in the column that is not negated, so I don't know what data would produce that error. See this toy example:

import numpy as np
import pandas as pd

row1list = [True, False]
row2list = [True, True]
row3list = [False, 2.2]
row4list = [False, np.nan]
df = pd.DataFrame([row1list, row2list, row3list, row4list],
                  columns=['column1', 'column2'])

df['result_column'] = ~df['column1'] | df['column2']

print(df)
#    column1 column2  result_column
# 0     True   False          False
# 1     True    True           True
# 2    False     2.2           True
# 3    False     NaN           True
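
For comparison, here is a minimal sketch of one way the error from the question might arise. It assumes (this is a guess, the question does not show the data) that the negated column holds a missing value alongside booleans, which makes it object dtype. The `offline` and `online` values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a NaN next to booleans gives the column object dtype.
df = pd.DataFrame({
    "offline": [np.nan, True, False],
    "online": [False, False, True],
})

print(df["offline"].dtype)  # object

# List the Python types actually stored in the column.
type_names = sorted({type(v).__name__ for v in df["offline"]})
print(type_names)  # ['bool', 'float']

# On an object column, ~ is applied to each element, and a float has no ~.
try:
    ~df["offline"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Checking the stored types this way is a quick test of whether the column you are negating is really boolean.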

Hammurabi
  • The error is due to the mixed datatype of column1, which needs to be cast to the 'bool' datatype. The question is quite trivial once I learnt about data cleaning fundamentals. – Sudheer G Jun 30 '21 at 18:22
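
A minimal sketch of the cast described in that comment might look like the following. The data is hypothetical, and treating a missing `offline` value as False is an assumption you would need to check against your own data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in the column to be negated.
df = pd.DataFrame({
    "offline": [np.nan, True, False],
    "online": [False, False, True],
})

# Assumption: a missing 'offline' value should count as False.
# Go through the nullable "boolean" dtype so NaN is handled explicitly,
# then convert to a plain bool column.
offline = df["offline"].astype("boolean").fillna(False).astype(bool)
online = df["online"].astype(bool)

df["status"] = ~offline | online
print(df["status"].tolist())  # [True, False, True]
```

Note that calling `.astype(bool)` directly on the mixed column would turn NaN into True, because `bool(float('nan'))` is True, so it is safer to decide what missing values should mean before casting.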