4

I have a pandas column with values ranging from 0.0 to 1.0.

I want to convert this column to a binary column (0 or 1) based on a threshold: values <= threshold should become 0, and all other values should become 1.

mommomonthewind

3 Answers

5

Create a boolean mask with gt (>) and then convert it to integers:

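import pandas as pd

df = pd.DataFrame({'col': [.4, 0.5, .1]})

threshold = .2
# gt() builds a boolean mask (col > threshold); astype(int) maps True/False to 1/0
df['new'] = df['col'].gt(threshold).astype(int)
print(df)
   col  new
0  0.4    1
1  0.5    1
2  0.1    0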
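
Note that gt is a strict comparison, so a value exactly equal to the threshold maps to 0, which matches the <= rule in the question. A minimal sketch checking that boundary case, using made-up sample values that are not from the original post:

```python
import pandas as pd

# Hypothetical sample values; 0.2 sits exactly on the threshold.
df = pd.DataFrame({'col': [0.1, 0.2, 0.3]})
threshold = 0.2

# gt is strict (>), so values <= threshold become 0 and the rest become 1.
df['new'] = df['col'].gt(threshold).astype(int)
print(df['new'].tolist())  # [0, 0, 1]
```

If the boundary value should map to 1 instead, ge (>=) would be the comparison to use.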
jezrael
2
# Compare against the threshold, convert the boolean result to 0/1, and assign it back
df['column'] = (df['column'] > threshold).astype(int)
grshankar
-2

I would create a helper column, then iterate through the rows and set the value for each cell. Something like this:

import pandas as pd
import numpy as np
a = np.random.random_sample(5)
df = pd.DataFrame({"A": a})
df["Helper"] = 0  # integer placeholder, so the column is not created as strings
for i in range(len(df)):
    if df.loc[i,"A"] <= 0.5:
        df.loc[i,"Helper"] = 0
    else:
        df.loc[i,"Helper"] = 1

Which leads to this:

          A  Helper
0  0.114089       0
1  0.309759       0
2  0.158169       0
3  0.444199       0
4  0.645443       1
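
For comparison, the same column can be produced without an explicit loop. This is a sketch using numpy.where rather than the row-by-row approach above; the fixed seed and the 0.5 threshold are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Fixed seed (an assumption) so the sample data is reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.random(5)})

# 0 where A <= 0.5, 1 otherwise, computed for the whole column at once.
df["Helper"] = np.where(df["A"] <= 0.5, 0, 1)
print(df)
```

On large frames this usually runs much faster than setting cells one at a time with df.loc.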
OD1995