I have a problem with the if statement when I want to add a new column.

import pandas as pd
scan = pd.DataFrame([[1,2,3],['a','b','c']], columns=['st','nd','rd'])
scan['th'] = 0 if scan['st'] == 0 else 1

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I fix this?

wjandrea
smet for
  • Does this answer your question? [Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()](https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o) – vijaysharma Jan 07 '23 at 19:37
  • How do you *want* to fix it? Like, do you want to make *each* value of `th` dependent on the respective value at `st`, or do you want to make *all* values of `th` dependent on an aggregate of `st`, like `.all()`? (In other words, assign a vector or a scalar?) – wjandrea Jan 07 '23 at 19:39
  • In your own words, where the code says `scan['st'] == 0`, what do you expect this to mean? For the given value of `scan`, what do you think the result will be? Similarly: what do you think it will mean, to do `scan['th'] = 0`, or `scan['th'] = 1`? Were you hoping that it would automatically **iterate over rows**, check the value for the `st` column in each row, and set the corresponding value in the `th` column? Pandas and Numpy are not **that** magical; they are still bound by Python's language grammar. – Karl Knechtel Jan 07 '23 at 19:40
  • Does [Create new column based on values from other columns / apply a function of multiple columns, row-wise in Pandas](/questions/26886653) answer your question? – Karl Knechtel Jan 07 '23 at 19:41

1 Answer

Use numpy.where to apply those conditions and values to the whole array:

import numpy as np
scan['th'] = np.where(scan['st'] == 0, 0, 1)
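
To illustrate why this works, here is a minimal sketch on made-up numeric data (the `demo` frame and its values are assumptions for illustration, not the data from the question). The comparison `demo['st'] == 0` is itself a boolean Series with one entry per row, which a plain `if` cannot collapse into a single True or False; `np.where` instead picks a value for each element.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data, chosen only to illustrate the behaviour.
demo = pd.DataFrame({'st': [0, 1, 0, 5]})

# The comparison yields a boolean Series, one entry per row.
mask = demo['st'] == 0

# np.where selects 0 where the mask is True and 1 elsewhere.
demo['th'] = np.where(mask, 0, 1)

print(demo['th'].tolist())  # [0, 1, 0, 1]
```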

It might be interesting to benchmark the former approach against this one:

scan['th'] = (scan['st'] != 0).astype(int)
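
One possible way to run that comparison is sketched below with the standard-library `timeit` module. The `bench` frame, its size, and the helper names `with_where` and `with_astype` are assumptions made for this example; the timings it prints will vary with the machine and the installed pandas and NumPy versions, so no result is claimed here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 random integers in [0, 3).
rng = np.random.default_rng(0)
bench = pd.DataFrame({'st': rng.integers(0, 3, size=100_000)})


def with_where():
    # Returns a NumPy array of 0/1 values.
    return np.where(bench['st'] == 0, 0, 1)


def with_astype():
    # Returns a pandas Series of 0/1 values.
    return (bench['st'] != 0).astype(int)


# Both approaches should produce identical values.
assert (with_where() == with_astype().to_numpy()).all()

# Time 100 calls of each approach.
for fn in (with_where, with_astype):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds:.4f} s for 100 runs")
```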
Guimoute