Override default True == 1, False == 0 behavior

Question

I have dataframes that can contain a mix of booleans and integers. I'd like to be able to do things like df_1 == df_2.loc[0,0] and be guaranteed that if df_2.loc[0,0] is 1, it won't match True values in df_1.

Could you elaborate on what you mean by "I have dataframes that can contain a mix of booleans and integers"? What dtypes? object? — Brian61354270, Jul 28 '23 at 00:01
The dataframes are formed from csv data which will contain strings, ints, floats, bools, etc in unknown columns - so the dtypes aren't really known, it's whatever `read_csv` decides — user6118986, Jul 28 '23 at 00:04
A single column is always a single datatype. You can't mix booleans and ints within a single column. — Tim Roberts, Jul 28 '23 at 00:14
Dataframes _do_ have fixed dtypes. What are the dtypes of the dataframes you're working with? — Brian61354270, Jul 28 '23 at 00:14
Perhaps you should show us an example of the data you're using. — Tim Roberts, Jul 28 '23 at 00:17
Just as a note, sometimes a data cleaning step is useful - i.e., converting `0,1,'yes','no','true','false','off','on','oui','non'` and all your other mixed up values to clean booleans. — topsail, Jul 28 '23 at 00:20
@Brian61354270 I don't know the OP's data, but you can easily create this with `df = pd.DataFrame({'col': [True, 1, False, 0]})`. The dtype is `dtype('O')`. — Barmar, Jul 28 '23 at 00:44
If you print the df, it shows `True` and `1` in the first two rows. But `df['col'] == 1` is True in both rows. — Barmar, Jul 28 '23 at 00:47
@OCa [Please don't change code functionality/conventions in the question.](//meta.stackoverflow.com/q/260245/4518341) — wjandrea, Jul 28 '23 at 16:14
Are you talking about bool and int in one column, like Barmar showed, or some bool columns, some int columns? An example would help a lot; check out [How to make good reproducible pandas examples](/q/20109391/4518341). — wjandrea, Jul 28 '23 at 16:53
You've got your answer now, I believe. Interesting question! I suppose the downvoting happened because you didn't include an input dataframe and a desired output. — OCa, Jul 29 '23 at 11:25

OCa · Answer 1 · 2023-07-31T22:15:33.893

Pre-processing your data so that booleans and integers never share a column is the better practice. But assuming you cannot separate integers from booleans in your dataframes, you can enhance == with boolean detection:

def BoolProofCompare(a, b):
    '''Override default True == 1, False == 0 behavior'''
    return a == b and isinstance(a, bool) == isinstance(b, bool)

BoolProofCompare(1, True)  # False
BoolProofCompare(0, False)  # False
BoolProofCompare(1, 1)  # True
BoolProofCompare(False, False)  # True
# and so on and so forth

As I understand it, what you want is a cell-by-cell comparison of a single value, e.g. df_2[0][0], with each element of a dataframe, e.g. df_1, with the True==1 and False==0 equalities disabled. In that case, use applymap (deprecated since pandas 2.1 in favor of DataFrame.map) to apply the above comparison to every cell:

# my example of input dataframe
df
    col1  col2
0   True     1
1      1     2
2  False     3
3      0     4

df.applymap(lambda x : BoolProofCompare(x, True))
    col1   col2
0   True  False
1  False  False
2  False  False
3  False  False

df.applymap(lambda x : BoolProofCompare(x, False))
    col1   col2
0  False  False
1  False  False
2   True  False
3  False  False

df.applymap(lambda x : BoolProofCompare(x, 1))
    col1   col2
0  False   True
1   True  False
2  False  False
3  False  False

df.applymap(lambda x : BoolProofCompare(x, 0))
    col1   col2
0  False  False
1  False  False
2  False  False
3   True  False

I suppose it would be more convenient to encapsulate the enhanced comparison inside a new function, like this:

def BoolProofCompare_df(df, a):
    '''
    Compare single value *a* with dataframe *df*, cell by cell, 
    with True==1 and False==0 equalities disabled.
    '''
    return df.applymap(lambda x : BoolProofCompare(x, a))
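
For reference, here is a self-contained sketch of the approach above, assuming the example frame shown earlier (col1 holds mixed bools and ints, so its dtype is object). The snake_case names are hypothetical renamings of the functions in this answer, and the method lookup is my own addition: it uses DataFrame.map where the installed pandas provides it and falls back to applymap otherwise.

```python
import pandas as pd


def bool_proof_compare(a, b):
    """Equal only if the values match and both, or neither, are bool."""
    return a == b and isinstance(a, bool) == isinstance(b, bool)


def bool_proof_compare_df(df, a):
    """Compare single value a with every cell of df, with True==1 disabled."""
    # Assumption: prefer DataFrame.map (pandas >= 2.1), else use applymap.
    method = "map" if hasattr(pd.DataFrame, "map") else "applymap"
    return getattr(df, method)(lambda x: bool_proof_compare(x, a))


# Example frame from this answer: col1 mixes bool and int (object dtype).
df = pd.DataFrame({"col1": [True, 1, False, 0], "col2": [1, 2, 3, 4]})

result = bool_proof_compare_df(df, 1)
print(result)
```

Only the cells holding the integer 1 should come out True here; the True in col1 no longer matches.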

score 0 · Answer 2 · answered Jul 29 '23 at 12:01

See @OCa's answer for the BoolProofCompare function. Here is an alternative implementation that also treats 0 (int) as different from 0.0 (float):

def BoolProofCompare(a, b):
    return a == b and type(a) == type(b)

A plain return a == b doesn't work because, in Python, bool is a subclass of int, so True == 1 and True == 1.0 both evaluate to True.
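
A short plain-Python illustration of both points, as a sketch only. The name strict_equal is hypothetical; it behaves like the type-checking comparison above.

```python
def strict_equal(a, b):
    """Hypothetical name: equal values of exactly the same type."""
    return type(a) is type(b) and a == b


# bool is a subclass of int, which is why the default == treats True as 1.
print(issubclass(bool, int))   # True
print(True == 1, True == 1.0)  # True True

# The type check separates bool from int, and int from float.
print(strict_equal(True, 1))   # False
print(strict_equal(0, 0.0))    # False
print(strict_equal(1, 1))      # True
```

One caveat with an exact type check: a NumPy scalar such as numpy.int64(1) is not of type int, so values that arrive as NumPy scalars would compare unequal to the corresponding Python int.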
