5

I have a DataFrame that contains one-hot encoded columns of 0s and 1s, stored as dtype int32.

a     b      h1      h2     h3
xy    za      0       0      1
ab    cd      1       0      0
pq    rs      0       1      0  

I want to convert the columns h1, h2 and h3 to boolean, so here is what I did:

df[df.columns[2:]].astype(bool)

But this changed all values of h1-h3 to True.

I also tried

df[df.columns[2:]].map({0:False, 1:True})

but that does not work either (AttributeError: 'DataFrame' object has no attribute 'map').

What is the best way to convert specific columns of the dataframe from int32 0s and 1s to boolean (True/False)?

Red
Devarshi Goswami

3 Answers

7

You can select every column after the first two by position with DataFrame.iloc, convert the selection to boolean, and assign it back:

df.iloc[:, 2:] = df.iloc[:, 2:].astype(bool)
print(df)
    a   b     h1     h2     h3
0  xy  za  False  False   True
1  ab  cd   True  False  False
2  pq  rs  False   True  False

Or build a dictionary that maps every column name after the first two to bool and pass it to astype:

df = df.astype(dict.fromkeys(df.columns[2:], bool))
print(df)
    a   b     h1     h2     h3
0  xy  za  False  False   True
1  ab  cd   True  False  False
2  pq  rs  False   True  False
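
If you would rather name the columns than rely on their position, a minimal sketch along the same lines might look like this. The list name onehot_cols is my own choice for illustration, and the frame is rebuilt from the question's sample data so the snippet runs on its own:

```python
import pandas as pd

# Sample data from the question, with the one-hot columns stored as int32.
df = pd.DataFrame({
    "a": ["xy", "ab", "pq"],
    "b": ["za", "cd", "rs"],
    "h1": [0, 1, 0],
    "h2": [0, 0, 1],
    "h3": [1, 0, 0],
}).astype({"h1": "int32", "h2": "int32", "h3": "int32"})

# Hypothetical name: the columns to convert, listed explicitly.
onehot_cols = ["h1", "h2", "h3"]

# astype returns a new object, so the result has to be assigned back.
df[onehot_cols] = df[onehot_cols].astype(bool)

print(df.dtypes)
```

The important part in every variant is the assignment: astype never modifies the frame in place, so calling it without storing the result leaves df unchanged.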
jezrael
3

You were quite close with your second try. Apply bool element-wise with applymap and assign the result back:

df[df.columns[2:]] = df[df.columns[2:]].applymap(bool)
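
One caveat, as far as I know: pandas 2.1 deprecated DataFrame.applymap in favour of a new DataFrame.map method, so the AttributeError from the question only shows up on older versions. A rough sketch that tries to work on both, assuming the question's sample data (the elementwise name is just a local helper I made up):

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["xy", "ab", "pq"],
    "b": ["za", "cd", "rs"],
    "h1": [0, 1, 0],
    "h2": [0, 0, 1],
    "h3": [1, 0, 0],
})

cols = df.columns[2:]
subset = df[cols]

# Use DataFrame.map where it exists (pandas >= 2.1), otherwise fall back
# to the older applymap. Both call the function on every single element.
elementwise = subset.map if hasattr(subset, "map") else subset.applymap

# Assign the converted columns back; neither method changes df in place.
df[cols] = elementwise(bool)

print(df)
```

For a plain 0/1 to True/False conversion, astype(bool) is usually the simpler choice, since the element-wise methods call a Python function once per cell.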

Arpan
3

There is another option, though not the most Pythonic one. I'll include it because it can be useful when you want to convert strings (e.g. 'Cat' versus 'Dog') to Boolean (False, True) in one step:

import pandas as pd

df = pd.DataFrame({'a': ['xy', 'ab', 'pq'], 'b': ['za', 'cd', 'rs'], 'h1': [0, 1, 0], 'h2': [0, 0, 1], 'h3': [1, 0, 0]})

df = df.replace({0:False, 1:True})

Checking the data types:

df.dtypes

a     object
b     object
h1      bool
h2      bool
h3      bool
dtype: object
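
Note that replace, called on the whole frame like this, looks at every column, so any other column that happens to contain 0 or 1 would be touched as well. For the string case mentioned above, a sketch restricted to one column could use Series.map instead. The pets frame and its column names are invented for this example:

```python
import pandas as pd

# Hypothetical data: one string column to turn into a boolean flag.
pets = pd.DataFrame({
    "name": ["Rex", "Tom", "Fido"],
    "pet": ["Dog", "Cat", "Dog"],
})

# Series.map looks each value up in the dictionary; only this column is read.
pets["is_dog"] = pets["pet"].map({"Cat": False, "Dog": True})

print(pets)
```

Values that are missing from the dictionary come back as NaN with Series.map, so it is worth checking for unexpected categories before relying on the result.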
Ruthger Righart