15

When I try to convert some columns in a pandas dataframe from '0' and '1' to 'FALSE' and 'TRUE', pandas automatically detects dtype as boolean. I want to keep dtype as string, with the strings 'TRUE' and 'FALSE'.

booleanColumns = pandasDF.select_dtypes(include=[bool]).columns.values.tolist()
booleanDictionary = {'1': 'TRUE', '0': 'FALSE'}

pandasDF.to_string(columns = booleanColumns)

for column in booleanColumns:
    pandasDF[column].map(booleanDictionary)

Unfortunately, python automatically converts dtype to boolean with the last operation. How can I prevent this?

cottontail
Dendrobates

2 Answers

31

If you need to replace the boolean values True and False:

booleandf = pandasDF.select_dtypes(include=[bool])
booleanDictionary = {True: 'TRUE', False: 'FALSE'}

for column in booleandf:
    pandasDF[column] = pandasDF[column].map(booleanDictionary)

Sample:

import pandas as pd

pandasDF = pd.DataFrame({'A':[True,False,True],
                   'B':[4,5,6],
                   'C':[False,True,False]})

print (pandasDF)
       A  B      C
0   True  4  False
1  False  5   True
2   True  6  False

booleandf = pandasDF.select_dtypes(include=[bool])
booleanDictionary = {True: 'TRUE', False: 'FALSE'}

# iterating over a DataFrame iterates over its column names, same as: for column in booleandf.columns:
for column in booleandf:
    pandasDF[column] = pandasDF[column].map(booleanDictionary)

print (pandasDF)
       A  B      C
0   TRUE  4  FALSE
1  FALSE  5   TRUE
2   TRUE  6  FALSE

EDIT:

A simpler solution, using replace with a dict:

booleanDictionary = {True: 'TRUE', False: 'FALSE'}
pandasDF = pandasDF.replace(booleanDictionary)
print (pandasDF)
       A  B      C
0   TRUE  4  FALSE
1  FALSE  5   TRUE
2   TRUE  6  FALSE
jezrael
  • Beware of the simpler solution with replace: applying .replace() to an int column will result in 0 and 1 being replaced as well, as if they were boolean values :( – Kucera.Jan.CZ Dec 01 '21 at 10:54
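One way to sidestep the caveat in the comment above is to apply the mapping only to the columns whose dtype is actually bool, so an integer column holding 0 and 1 is never touched. A minimal sketch, assuming a made-up frame (the names `caveat_df`, `bool_cols` and `bool_to_text` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical sample: A is bool, B is an int column holding only 0 and 1.
caveat_df = pd.DataFrame({'A': [True, False, True],
                          'B': [1, 0, 1]})

# Restrict the operation to columns whose dtype is bool.
bool_cols = caveat_df.select_dtypes(include=[bool]).columns.tolist()
bool_to_text = {True: 'TRUE', False: 'FALSE'}

# Map each bool column on its own; the int column B is left alone.
for col in bool_cols:
    caveat_df[col] = caveat_df[col].map(bool_to_text)

print(caveat_df)
```

Because only the selected columns are reassigned, B keeps its integer dtype regardless of how replace treats 0 and 1 in a given pandas version.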
0

You can replace values in multiple columns in a single replace call.

mapping = {'1': 'TRUE', '0': 'FALSE'}
df[['A','B']] = df[['A','B']].replace(mapping)
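For illustration, here is a small self-contained run of the same idea, assuming the columns hold the strings '1' and '0' as in the question. The frame `str_df` and its column C are made up for the example:

```python
import pandas as pd

# Hypothetical sample: A and B hold '1'/'0' strings, C holds other strings.
str_df = pd.DataFrame({'A': ['1', '0', '1'],
                       'B': ['0', '0', '1'],
                       'C': ['10', '11', '12']})

str_mapping = {'1': 'TRUE', '0': 'FALSE'}

# Replace whole-cell matches in the two selected columns only.
str_df[['A', 'B']] = str_df[['A', 'B']].replace(str_mapping)

print(str_df)
```

Column C is not part of the selection, so its values stay as they were.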

If you're changing boolean columns into 'TRUE' and 'FALSE' strings, there is no need to replace anything; just change the dtype and uppercase the result.

df[['A', 'B']] = df[['A','B']].astype(str).apply(lambda x: x.str.upper())
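A short sketch of the dtype-based approach that also checks the result really is strings, which is what the question asks for. The frame `flags_df` is a made-up example, and picking the columns with select_dtypes is an assumption added here instead of hard-coding their names:

```python
import pandas as pd

# Hypothetical sample: A and B are bool columns, C is an int column.
flags_df = pd.DataFrame({'A': [True, False],
                         'B': [False, True],
                         'C': [4, 5]})

# Pick the bool columns automatically rather than listing them by hand.
flag_cols = flags_df.select_dtypes(include=[bool]).columns.tolist()

# True/False become 'True'/'False' via astype(str), then get uppercased.
flags_df[flag_cols] = flags_df[flag_cols].astype(str).apply(lambda s: s.str.upper())

print(flags_df)
# Every converted cell is now a Python str, not a bool.
print(all(isinstance(v, str) for v in flags_df['A']))
```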
cottontail