I have dataframes that can contain a mix of booleans and integers, and I'd like to be able to do things like `df_1 == df_2.loc[0,0]` with a guarantee that if `df_2.loc[0,0]` is `1`, it won't match `True` values in `df_1`.

- Could you elaborate on what you mean by "I have dataframes that can contain a mix of booleans and integers"? What dtypes? object? – Brian61354270 Jul 28 '23 at 00:01
- The dataframes are formed from csv data which will contain strings, ints, floats, bools, etc in unknown columns - so the dtypes aren't really known, it's whatever `read_csv` decides – user6118986 Jul 28 '23 at 00:04
- A single column is always a single datatype. You can't mix booleans and ints within a single column. – Tim Roberts Jul 28 '23 at 00:14
- Dataframes _do_ have fixed dtypes. What are the dtypes of the dataframes you're working with? – Brian61354270 Jul 28 '23 at 00:14
- Perhaps you should show us an example of the data you're using. – Tim Roberts Jul 28 '23 at 00:17
- Just as a note, sometimes a data cleaning step is useful - i.e., converting `0,1,'yes','no','true','false','off','on','oui','non'` and all your other mixed up values to clean booleans. – topsail Jul 28 '23 at 00:20
- @Brian61354270 I don't know the OP's data, but you can easily create this with `df = pd.DataFrame({'col': [True, 1, False, 0]})`. The dtype is `dtype('O')`. – Barmar Jul 28 '23 at 00:44
- @TimRoberts See my above comment. – Barmar Jul 28 '23 at 00:46
- If you print the df, it shows `True` and `1` in the first two rows. But `df['col'] == 1` is True in both rows. – Barmar Jul 28 '23 at 00:47
- @OCa [Please don't change code functionality/conventions in the question.](//meta.stackoverflow.com/q/260245/4518341) – wjandrea Jul 28 '23 at 16:14
- Are you talking about bool and int in one column, like Barmar showed, or some bool columns, some int columns? An example would help a lot; check out [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Jul 28 '23 at 16:53
- got your answer now i believe. Interesting question! I suppose the downvoting happened because you failed to add an input dataframe and a desired output. – OCa Jul 29 '23 at 11:25
2 Answers
Pre-processing your data so that different datatypes cannot collide is the better practice. But if you cannot separate integers from booleans in your dataframes, you can augment `==` with a boolean check:
def BoolProofCompare(a, b):
    '''Override default True == 1, False == 0 behavior'''
    return a == b and isinstance(a, bool) == isinstance(b, bool)

BoolProofCompare(1, True)       # False
BoolProofCompare(0, False)      # False
BoolProofCompare(1, 1)          # True
BoolProofCompare(False, False)  # True
# and so on and so forth
Now, I gather that what you want is a cell-by-cell comparison of a single value, e.g. `df_2[0][0]`, with each element of a dataframe, e.g. `df_1`, with the `True == 1` and `False == 0` equalities disabled. In that case, use `applymap` to apply the above comparison to every cell (in pandas 2.1 and later, `applymap` is deprecated in favor of `DataFrame.map`):
# my example of input dataframe
df
    col1  col2
0   True     1
1      1     2
2  False     3
3      0     4

df.applymap(lambda x: BoolProofCompare(x, True))
    col1   col2
0   True  False
1  False  False
2  False  False
3  False  False

df.applymap(lambda x: BoolProofCompare(x, False))
    col1   col2
0  False  False
1  False  False
2   True  False
3  False  False

df.applymap(lambda x: BoolProofCompare(x, 1))
    col1   col2
0  False   True
1   True  False
2  False  False
3  False  False

df.applymap(lambda x: BoolProofCompare(x, 0))
    col1   col2
0  False  False
1  False  False
2  False  False
3   True  False
I suppose it would be more convenient to encapsulate the enhanced comparison inside a new function, like this:
def BoolProofCompare_df(df, a):
    '''
    Compare single value *a* with dataframe *df*, cell by cell,
    with True==1 and False==0 equalities disabled.
    '''
    return df.applymap(lambda x: BoolProofCompare(x, a))
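For reference, here is a self-contained sketch of the same idea that can be pasted into a fresh session. The snake_case names `bool_proof_compare` and `bool_proof_compare_df` are my own renamings for illustration, and the sample dataframe is the one from the example above. The only addition is a check for `DataFrame.map`, which newer pandas versions offer as the replacement for `applymap`; treat it as one possible way to stay compatible, not the only one.

```python
import pandas as pd


def bool_proof_compare(a, b):
    """Equal values that also agree on being (or not being) a bool."""
    return bool(a == b) and isinstance(a, bool) == isinstance(b, bool)


def bool_proof_compare_df(df, value):
    """Compare every cell of df with value, keeping True/1 and False/0 apart."""
    def compare(cell):
        return bool_proof_compare(cell, value)

    # DataFrame.map exists in pandas 2.1+; older versions only have applymap.
    if hasattr(pd.DataFrame, "map"):
        return df.map(compare)
    return df.applymap(compare)


# Mixed bool/int column (object dtype) next to a plain integer column.
df = pd.DataFrame({"col1": [True, 1, False, 0], "col2": [1, 2, 3, 4]})

mask_for_one = bool_proof_compare_df(df, 1)
mask_for_true = bool_proof_compare_df(df, True)
print(mask_for_one)
print(mask_for_true)
```

The resulting masks can be used like any other boolean dataframe, for example `df[mask_for_one]` or `mask_for_one.any(axis=1)`.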

See @OCa's answer for the `BoolProofCompare` function. An alternative implementation which also makes `0` (int) different from `0.0` (float):
def BoolProofCompare(a, b):
    return a == b and type(a) is type(b)
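To see where the two variants diverge, here is a small pure-Python sketch that runs both on the same pairs. The names `compare_bool_aware` and `compare_same_type` are hypothetical labels for the `isinstance`-based and the type-based implementation respectively.

```python
def compare_bool_aware(a, b):
    # isinstance-based variant: only bool vs. non-bool is kept apart.
    return a == b and isinstance(a, bool) == isinstance(b, bool)


def compare_same_type(a, b):
    # type-based variant: any difference in exact type counts as a mismatch.
    return a == b and type(a) is type(b)


# A list of pairs (not a dict: keys such as (1, True) and (1, 1) would collide).
pairs = [(1, True), (0, False), (0, 0.0), (1, 1), (False, False)]

for a, b in pairs:
    print(repr(a), repr(b), compare_bool_aware(a, b), compare_same_type(a, b))
```

The two functions agree on every pair except `(0, 0.0)`, which only the type-based variant rejects. Which one fits depends on whether an integer column should be allowed to match float values.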
Plain `return a == b` doesn't work because `bool` is a subclass of `int` in Python, so `True == 1` and `True == 1.0` both evaluate to `True`.
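This is easy to confirm in an interpreter. The snippet below is only a sanity check of standard Python behavior; the variable names are mine. It also shows why an `isinstance(x, int)` test alone cannot tell the two apart, which is why both answers test for `bool` or for the exact type instead.

```python
# bool is a subclass of int, so booleans behave like the integers 0 and 1.
bool_is_int_subclass = issubclass(bool, int)
true_is_int_instance = isinstance(True, int)

# Equality therefore treats True like 1 (and 1.0), and False like 0.
equalities = (True == 1, True == 1.0, False == 0)

# The exact types still differ, which is what the stricter comparisons rely on.
same_exact_type = type(True) is type(1)

print(bool_is_int_subclass, true_is_int_instance)
print(equalities)
print(same_exact_type)
```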
