How can I map True/False to 1/0 in a Pandas DataFrame?

Question

I have a column in a pandas DataFrame that holds boolean True/False values, but for further calculations I need a 1/0 representation. Is there a quick pandas/numpy way to do that?

To parrot @JonClements, why do you need to convert bool to int to use in calculation? bool works with arithmetic directly (since it is internally an int). — cs95, Jul 14 '20 at 02:09
@cs95 - Pandas uses numpy bools internally, and they can behave a little differently. In plain Python, True + True = 2, but in Pandas, numpy.bool_(True) + numpy.bool_(True) = True, which may not be the desired behavior on your particular calculation. — sql_knievel, Jan 19 '22 at 19:57
I needed it because statsmodels would not allow boolean data for logistic regression. — Peter B, Aug 18 '22 at 02:12

score 533 · Accepted Answer · edited Apr 27 '20 at 21:17

A succinct way to convert a single column of boolean values to a column of integers 1 or 0:

df["somecolumn"] = df["somecolumn"].astype(int)
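A minimal sketch of the conversion above on a small made-up frame. Only the column name `somecolumn` comes from the answer; the data is illustrative.

```python
import pandas as pd

# Illustrative data; only the column name comes from the answer above.
df = pd.DataFrame({"somecolumn": [True, False, True]})

# Cast the boolean column to integers: True -> 1, False -> 0.
df["somecolumn"] = df["somecolumn"].astype(int)

print(df["somecolumn"].tolist())  # [1, 0, 1]
```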

edited Apr 27 '20 at 21:17 by AMC

answered Dec 08 '14 at 16:36 by User

The corner case is if there are NaN values in `somecolumn`. Using `astype(int)` will then fail. Another approach, which converts `True` to 1.0 and `False` to 0.0 (floats) while preserving NaN-values is to do: `df.somecolumn = df.somecolumn.replace({True: 1, False: 0})` – DustByte Jan 10 '20 at 11:29
@DustByte Good catch! – Homunculus Reticulli Apr 14 '20 at 13:49
@DustByte Couldn't you just use `astype(float)` and get the same result? – AMC Apr 27 '20 at 21:29
If the values are text such as lowercase "true"/"false" (SAS outputs its bools that way), map them explicitly, e.g. `.map({"true": 1, "false": 0})`; `astype(bool).astype(int)` would turn every non-empty string into 1. – Golden Lion Sep 29 '20 at 11:03
how can this be applied to a number of columns? – unaied Mar 29 '21 at 10:41
Thank you. Should I do this for each column, or is there a command that doesn't require specifying the column name? – Avv Jul 03 '21 at 03:04
You can also consider nullable integer types instead of `int`, like `"Int64"` or `"Int8"` (note the uppercase) – C. Yduqoli Jul 10 '23 at 08:15
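The comments above raise two follow-ups: missing values and converting several columns at once. The following is a hedged sketch of one way to handle both, assuming a made-up frame; the column names `flag`, `a`, `b`, and `x` are illustrative and not from the thread. It routes the column with missing values through the nullable `"boolean"` dtype before `"Int64"`, and uses `select_dtypes` to find the plain boolean columns.

```python
import numpy as np
import pandas as pd

# Illustrative frame: "flag" holds booleans plus a missing value, so its dtype is object.
df = pd.DataFrame({
    "flag": [True, False, np.nan],
    "a": [True, True, False],
    "b": [False, True, False],
    "x": [0.5, 1.5, 2.5],
})

# Nullable route: convert to the nullable "boolean" dtype first, then to "Int64".
# The missing value is kept as <NA> instead of raising an error.
df["flag"] = df["flag"].astype("boolean").astype("Int64")

# Several columns at once: find every bool-dtype column and cast them together.
bool_cols = df.select_dtypes(include="bool").columns
df = df.astype({col: int for col in bool_cols})

print(df.dtypes)
```

Non-boolean columns such as `x` are left untouched because they are never selected.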

score 93 · Answer 2 · answered Jun 05 '16 at 21:54

Just multiply your DataFrame by 1 (an int):

[1]: import pandas as pd
[2]: data = pd.DataFrame([[True, False, True], [False, False, True]])
[3]: print(data)
       0      1     2
0   True  False  True
1  False  False  True

[4]: print(data * 1)
   0  1  2
0  1  0  1
1  0  0  1

answered Jun 05 '16 at 21:54 by shubhamgoel27

What are the advantages of this solution? – AMC Apr 27 '20 at 23:56
@AMC There are none, it's a hacky way to do it. – Phillip Copley Nov 17 '20 at 21:54
@AMC if your dataframe has `float` types beside booleans this method won't ruin them, `df.astype(int)` does. And since it's hacky it's probably a good idea to make intention clear with comment like `# bool -> int`. – Dmitriy Work Feb 17 '21 at 18:42
There is an advantage of using `data * 1` against `data + 0` with mixed types – it works on strings as well, where `data + 0` throws an error. Equivalent performance-wise. – Dmitriy Work Feb 17 '21 at 18:56
advantage: slightly shorter – qwr Oct 17 '21 at 23:44
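To illustrate the mixed-type point made in the comments, here is a small sketch comparing `* 1` with a blanket `astype(int)`. The frame and the column names `flag` and `ratio` are made up for the example.

```python
import pandas as pd

# Illustrative frame mixing a boolean column with a float column.
df = pd.DataFrame({"flag": [True, False], "ratio": [0.25, 1.75]})

scaled = df * 1        # booleans become 1/0, floats are left as they are
cast = df.astype(int)  # floats are truncated to integers as well

print(scaled["ratio"].tolist())  # [0.25, 1.75]
print(cast["ratio"].tolist())    # [0, 1]
```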

Gareth Latty · Answer 3 · 2013-06-29T18:04:21.843

49

True is 1 in Python, and likewise False is 0*:

>>> True == 1
True
>>> False == 0
True

You should be able to perform any operations you want on them by just treating them as though they were numbers, as they are numbers:

>>> issubclass(bool, int)
True
>>> True * 5
5

So to answer your question, no work necessary - you already have what you are looking for.

* Note: I use "is" as an English word, not the Python keyword `is`; `True` will not be the same object as any random 1.

edited Jun 29 '13 at 18:04

answered Jun 29 '13 at 17:58 by Gareth Latty

Just be careful with data types if doing floating point math: `np.sin(True).dtype` is float16 for me. – jorgeca Jun 29 '13 at 18:09
I've got a dataframe with a boolean column, and I can call `df.my_column.mean()` just fine (as you imply), but when I try: `df.groupby("some_other_column").agg({"my_column":"mean"})` I get `DataError: No numeric types to aggregate`, so it appears they are **NOT** always the same. Just FYI. – dwanderson Dec 15 '16 at 21:10
In pandas version 0.24 (and maybe earlier) you can aggregate `bool` columns just fine. – BallpointBen Feb 11 '19 at 22:09
It looks like numpy also throws errors with boolean types: `TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead.` Using @User's answer fixes this. – Amadou Kone Mar 13 '19 at 16:01
Another reason it's not the same: df.col1 + df.col2 + df.col3 doesn't work for `bool` columns as it does for `int` columns – colorlace May 24 '19 at 21:55
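The differences described in the comments above can be reproduced with a short sketch. The frame is made up; the column names `col1` and `col2` follow the last comment.

```python
import numpy as np
import pandas as pd

# Illustrative boolean columns.
df = pd.DataFrame({"col1": [True, True], "col2": [True, False]})

# Adding bool columns behaves like a logical OR and stays boolean...
bool_sum = df["col1"] + df["col2"]

# ...whereas casting to int first gives an arithmetic sum.
int_sum = df["col1"].astype(int) + df["col2"].astype(int)

print(bool_sum.tolist())  # [True, True]
print(int_sum.tolist())   # [2, 1]

# NumPy rejects `-` between boolean arrays outright.
try:
    np.array([True, False]) - np.array([True, True])
except TypeError as exc:
    print("TypeError:", exc)
```

So although Python's `bool` is a subclass of `int`, boolean columns do not always behave like integer columns, which is why an explicit conversion can still be needed.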

score 48 · Answer 4 · answered Dec 21 '20 at 15:30

This question specifically mentions a single column, so the currently accepted answer works. However, it doesn't generalize to multiple columns. For those interested in a general solution, use the following:

df.replace({False: 0, True: 1}, inplace=True)

This works for a DataFrame that contains columns of many different types, regardless of how many are boolean.

score 23 · Answer 5 · answered Jun 29 '13 at 18:17

You can also do this directly on DataFrames:

In [103]: from pandas import DataFrame

In [104]: df = DataFrame(dict(A=True, B=False), index=range(3))

In [105]: df
Out[105]: 
      A      B
0  True  False
1  True  False
2  True  False

In [106]: df.dtypes
Out[106]: 
A    bool
B    bool
dtype: object

In [107]: df.astype(int)
Out[107]: 
   A  B
0  1  0
1  1  0
2  1  0

In [108]: df.astype(int).dtypes
Out[108]: 
A    int64
B    int64
dtype: object

score 4 · Answer 6 · edited Apr 27 '20 at 21:19

Use Series.view to convert booleans to integers:

df["somecolumn"] = df["somecolumn"].view('i1')

edited Apr 27 '20 at 21:19 by AMC

answered Apr 13 '20 at 12:41 by jezrael

score 2 · Answer 7 · answered Dec 24 '19 at 20:27

You can apply a transformation to your data frame:

df = pd.DataFrame(my_data_condition)  # my_data_condition: placeholder for your boolean data

Transforming True/False into 1/0:

df = df * 1

answered Dec 24 '19 at 20:27 by Bruno Benevides

This is identical to [this solution](https://stackoverflow.com/a/37647160/11301900), posted 3 years earlier. – AMC Apr 27 '20 at 23:59

score 2 · Answer 8 · answered Sep 29 '20 at 21:39

I had to map FAKE/REAL to 0/1 but couldn't find a proper answer.

Below is how to map a column named 'type', which holds the values FAKE/REAL, to 0/1.
(Note: the same approach can be applied to any column name and values.)

df.loc[df['type'] == 'FAKE', 'type'] = 0
df.loc[df['type'] == 'REAL', 'type'] = 1

answered Sep 29 '20 at 21:39 by kaishu

Much simpler: `df['type'] = df['type'].map({'REAL': 1, 'FAKE': 0})`. In any case, I'm not sure it's too relevant to this question. – AMC Nov 18 '20 at 01:29
Thanks for providing simpler solution. As I mentioned in answer, I was trying to find solution for slightly different question, and only similar questions like this were available. Hope my answer and your solution will help someone in future. – kaishu Nov 26 '20 at 15:59
There are other questions which already cover that, though, like https://stackoverflow.com/q/20250771. – AMC Nov 26 '20 at 21:27
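A short sketch of the `map` approach suggested in the comments, on made-up data. The label "UNKNOWN" is hypothetical and is included only to show what happens to values that are missing from the mapping.

```python
import pandas as pd

# Illustrative data; "UNKNOWN" is a made-up label to show the unmapped case.
df = pd.DataFrame({"type": ["FAKE", "REAL", "UNKNOWN"]})

# Map labels to integers; labels missing from the dict become NaN.
mapped = df["type"].map({"REAL": 1, "FAKE": 0})

print(mapped.tolist())  # [0.0, 1.0, nan]
```

Because of the NaN, the result is a float column here; with only FAKE/REAL values present it would hold integers.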

YScharf · Answer 9 · 2023-04-13T11:25:53.570

2

Tried and tested:

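df[col] = df[col].map({True: 1, False: 0})  # use the string keys 'True'/'False' if the column holds text

If more than one column holds True/False values, use the following, where bool_cols is a list of those column names:

for col in bool_cols:
    df[col] = df[col].map({True: 1, False: 0})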

@AMC wrote this in a comment

edited Apr 13 '23 at 11:25

answered Feb 13 '23 at 14:41 by YScharf

score 1 · Answer 10 · edited Mar 01 '23 at 16:15

If the column has dtype object and you want to convert it to integers, for example:

df["somecolumn"] = df["somecolumn"].astype(bool).astype(int)

edited Mar 01 '23 at 16:15 by Ali

answered Feb 20 '23 at 17:37 by sanjayy27

Make sure to put code in code blocks. – Blue Robin Feb 24 '23 at 02:00

score 0 · Answer 11 · answered Jan 17 '22 at 14:35

This is a reproducible example based on some of the existing answers:

import pandas as pd


def bool_to_int(s: pd.Series) -> pd.Series:
    """Convert the boolean to binary representation, maintain NaN values."""
    return s.replace({True: 1, False: 0})


# generate a sample dataframe
df = pd.DataFrame({"a": range(10), "b": range(10, 0, -1)}).assign(
    a_bool=lambda df: df["a"] > 5,
    b_bool=lambda df: df["b"] % 2 == 0,
)

# select all bool columns (or specify which cols to use)
bool_cols = [c for c, d in df.dtypes.items() if d == "bool"]

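# apply the new coding to a new dataframe (or can replace the existing one);
# c=c binds each column name when the lambda is created, otherwise every lambda would use the last column
df_new = df.assign(**{c: lambda df, c=c: df[c].pipe(bool_to_int) for c in bool_cols})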

score 0 · Answer 12 · answered Apr 13 '23 at 11:38

The most efficient way to convert True/False values to 1/0 in a Pandas DataFrame is arguably the pd.Series.view() method. It returns a new Series that shares memory with the original DataFrame column but interprets the data as a different type. Here's an example:

import pandas as pd

# create a sample DataFrame with True/False values
df = pd.DataFrame({'A': [True, False, True], 'B': [False, True, False]})

# convert True/False values to 1/0 using view()
df['A'] = df['A'].view('i1')
df['B'] = df['B'].view('i1')

# print the resulting DataFrame
print(df)
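As far as I know, `Series.view` is deprecated in recent pandas releases, so it may warn or fail depending on the installed version. Here is a hedged sketch of the same reinterpretation done on a NumPy array with `ndarray.view` instead. The variable names `bools` and `ints` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [True, False, True]})

# Copy the boolean column into a plain NumPy array.
bools = df["A"].to_numpy(copy=True)

# Reinterpret the same bytes as 8-bit integers; no data is copied at this step.
ints = bools.view("i1")

print(ints.tolist())                  # [1, 0, 1]
print(np.shares_memory(bools, ints))  # True

# Store the result back as an integer column.
df["A"] = ints
```

This works because a NumPy boolean occupies one byte, so it can be read as `int8` without changing the stored data.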

score 0 · Answer 13 · answered Aug 21 '23 at 18:54

For any odd number n greater than 1: True % n = 1 and False % n = 0.

answered Aug 21 '23 at 18:54 by J Rwar
