
I would like to create a new DataFrame by replacing the values of an existing DataFrame using a custom function, but I keep getting the following error: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

I tried some of the suggestions from "Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()", but they didn't work.

I would appreciate it if somebody could shed light on this issue and help me out. Thank you in advance for your time.

def convert2integer(x):
    if x <= -0.5:
        return -1
    elif (x > -0.5 & x <= 0.5):
        return 0
    elif x > 0.5:
        return 1

df = pd.DataFrame({'A':[1,0,-0.6,-1,0.7],
       'B':[-1,1,-0.3,0.5,1]})

df.apply(convert2integer)
Henry Ecker
Amilovsky

1 Answer


A few options:

  1. The slower option, but the one closest to the original code, via applymap:
import pandas as pd


def convert2integer(x):
    if x <= -0.5:
        return -1
    elif x <= 0.5:
        return 0
    else:
        return 1


df = pd.DataFrame({'A': [1, 0, -0.6, -1, 0.7],
                   'B': [-1, 1, -0.3, 0.5, 1]})

new_df = df.applymap(convert2integer)

new_df:

   A  B
0  1 -1
1  0  1
2 -1  0
3 -1  0
4  1  1

applymap applies the function to each cell of the DataFrame, so x is a single float and can be compared with ordinary if/elif logic.
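For contrast, here is a minimal sketch of why df.apply raised the error in the question. By default, apply passes each whole column to the function as a Series, so `x <= -0.5` yields a boolean Series that `if` cannot reduce to a single truth value. The names `seen_types`, `record_type`, `error_message` and `elementwise` are hypothetical and only there for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 0, -0.6, -1, 0.7],
                   'B': [-1, 1, -0.3, 0.5, 1]})


def convert2integer(x):
    if x <= -0.5:
        return -1
    elif x <= 0.5:
        return 0
    else:
        return 1


# Hypothetical helper: record the type of object each call receives.
seen_types = []


def record_type(x):
    seen_types.append(type(x).__name__)
    return x


df.apply(record_type)      # called with whole columns, not single cells
print(set(seen_types))     # the only type seen is 'Series'

# Calling the scalar function on a whole column reproduces the error.
try:
    df.apply(convert2integer)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Applying the function to each element of each column avoids the problem.
elementwise = df.apply(lambda col: col.apply(convert2integer))
print(elementwise)
```

The last line is an element-wise alternative that does not depend on applymap. As far as I know, recent pandas releases (2.1 and later) deprecate applymap in favour of DataFrame.map, so check which name your version expects.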

  2. The faster option via np.select:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 0, -0.6, -1, 0.7],
                   'B': [-1, 1, -0.3, 0.5, 1]})

new_df = pd.DataFrame(
    np.select([df.le(-0.5), df.le(0.5)],
              [-1, 0],
              default=1),
    index=df.index,
    columns=df.columns
)

new_df:

   A  B
0  1 -1
1  0  1
2 -1  0
3 -1  0
4  1  1

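np.select takes a list of conditions and a list of choices. Where a condition is True, it uses the value at the corresponding index of the choice list; if several conditions are True for the same element, the first one wins, and if none are True it uses default.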
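The same behaviour can be seen on a plain one-dimensional array. This is only an illustrative sketch; the names `values`, `conditions`, `choices` and `result` are made up for the example.

```python
import numpy as np

values = np.array([1, 0, -0.6, -1, 0.7])

# Conditions are evaluated in order; the first True one decides the result.
conditions = [values <= -0.5, values <= 0.5]
choices = [-1, 0]

# Elements matching no condition fall back to `default`.
result = np.select(conditions, choices, default=1)
print(result.tolist())  # [1, 0, -1, -1, 1]
```

Note that -0.6 and -1 satisfy both conditions, yet they map to -1 because the first condition takes priority.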

The last condition does not need to be checked: a value that matched neither of the first two conditions must be greater than 0.5. Likewise, the second condition does not need to check that the value is greater than -0.5, because otherwise the first condition would already have matched.
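To check that reasoning, the simplified conditions can be compared against a fully explicit version. This is a sketch under the assumption that the data contains no NaN values; the names `simplified` and `explicit` are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 0, -0.6, -1, 0.7],
                   'B': [-1, 1, -0.3, 0.5, 1]})

# Simplified form: relies on the order of the conditions.
simplified = np.select([df.le(-0.5), df.le(0.5)], [-1, 0], default=1)

# Explicit form: every range is spelled out, as in the question.
explicit = np.select(
    [df.le(-0.5), df.gt(-0.5) & df.le(0.5), df.gt(0.5)],
    [-1, 0, 1],
)

print(np.array_equal(simplified, explicit))  # True
```

If the explicit range is written with comparison operators instead of methods, each comparison needs its own parentheses, as in `(df > -0.5) & (df <= 0.5)`, because `&` binds more tightly than `>` and `<=` in Python.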

Henry Ecker