I am a newbie to Python. I am trying to create a new variable based on two columns in the dataset.

def cal_freqw(var1, var2):
    if var1 == 1:
        return 0
    elif (var1 == 2 and var2 < 998):
        return 7*var2
    elif var1 == 3:
        return var2/31
    elif var1 == 98:
        return 9998
    elif var2 == 998:
        return 9998

df["FREQW"]=cal_freqw(df["UNIT"], df["NUM"])

I get this error message:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
 

df["UNIT"] and df["NUM"] are integers. (Correction: they are Series of integers.)

Because I have several variables like "UNIT" and "NUM" to be computed, a function would help. Could someone help me to fix my calling command? Thank you!

After editing my code as suggested by Zichzheng and others, I still receive the following error message:


      2
      3 def cal_freqw(var1, var2):
----> 4     if var1 == 1:
      5         return 0
      6     elif (var1 == 2 & var2 < 998):
 

   1477               1     2
   1478         0  10.0  20.0
-> 1479         >>> df.equals(different_data_type)
   1480         False
   1481         """

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thank you for your help!

Treepmunk
  • `df["UNIT"] and df["NUM"] are integers`: no, they are `Series of integers`, so `var1 == 1` has no single truth value. Either `all` elements of the Series are `== 1`, or at least one (`any`) is `== 1`. – Epsi95 Jul 21 '21 at 03:25
  • Does this answer your question? [Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()](https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o) – Henry Ecker Jul 21 '21 at 05:06
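
The point made in the comment above can be reproduced in a few lines. This is only an illustrative sketch: the Series `s` and its values are made up and are not the asker's data.

```python
import pandas as pd

s = pd.Series([1, 2, 3])  # illustrative values, not the asker's data

mask = s == 1             # element-wise comparison gives a boolean Series
print(mask.tolist())      # [True, False, False]

raised = False
try:
    if mask:              # `if` needs ONE truth value, a Series has many
        pass
except ValueError:
    raised = True
print(raised)             # True

print(mask.any())         # True: at least one element equals 1
print(mask.all())         # False: not every element equals 1
```

So a comparison on a whole column is fine by itself. The error appears only when the result is used where Python expects a single True or False, such as in `if`, `elif`, `and`, or `or`.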

1 Answer

The `or` and `and` operators need a single truth value, and for a pandas Series that value is ambiguous. For element-wise logic you need to use `|` (OR) and `&` (AND) instead, with each comparison wrapped in parentheses.

So, your code will be:

def cal_freqw(var1, var2):
    if var1 == 1:
        return 0
    elif (var1 == 2) & (var2 < 998):
        return 7*var2
    elif var1 == 3:
        return var2/31
    elif var1 == 98:
        return 9998
    elif var2 == 998:
        return 9998

df["FREQW"]=cal_freqw(df["UNIT"], df["NUM"])

See this document if you want to know more about it.
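
Note that swapping `and` for `&` only changes the combined condition. Each `if`/`elif` still asks for one truth value, so calling the function on whole columns keeps raising the same ValueError. Below is a minimal vectorized sketch. It assumes `numpy.select` is acceptable here and that rows matching no rule should become NaN; both choices are assumptions and not part of the original answer. The sample values are the five rows posted in the comments below.

```python
import numpy as np
import pandas as pd

def cal_freqw(unit, num):
    """Vectorized version: unit and num are pandas Series of equal length."""
    conditions = [
        unit == 1,
        (unit == 2) & (num < 998),   # parentheses are required around each comparison
        unit == 3,
        unit == 98,
        num == 998,
    ]
    choices = [
        0,
        7 * num,
        num / 31,
        9998,
        9998,
    ]
    # For each row, np.select takes the choice of the first condition that is True.
    # Rows matching no condition get the default (NaN is an assumption made here).
    result = np.select(conditions, choices, default=np.nan)
    return pd.Series(result, index=unit.index)

# First five observations reported by the asker
df = pd.DataFrame({"UNIT": [98, 3, 4, 3, 3], "NUM": [998, 3, 2, 5, 20]})
df["FREQW"] = cal_freqw(df["UNIT"], df["NUM"])
print(df)
```

Because the conditions are checked in order, this mirrors the `if`/`elif` chain: the first matching rule wins. The same function can be called with any other pair of columns.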

Zichzheng
  • Thank you all very much for the replies. Now I understand that I need to use **&**. I followed Zichzheng's code, but somehow it shows me the same error message as before. I will update the error message in my question. – Treepmunk Jul 21 '21 at 20:26
  • I think each condition also needs to be inside parentheses. I just updated my answer. Try: (var1 == 2) & (var2 < 998) – Zichzheng Jul 21 '21 at 20:44
  • Thank you very much, Zichzheng. I just tried including the parentheses. It gives me the same error messages. – Treepmunk Jul 21 '21 at 20:56
  • Can you do a print(df["UNIT"]) and print(df["NUM"])? I would like to know the exact value. – Zichzheng Jul 21 '21 at 21:02
  • I am only showing the first 5 observations. Otherwise, it'd be too much. Thank you for helping me to troubleshoot! (Index removed) (UNIT: 98 3 4 3 3)(NUM: 998 3 2 5 20) – Treepmunk Jul 22 '21 at 13:15
  • Restricting the function with one variable only still gives the same error message. Changing the series of integers to series of floats also doesn't help. – Treepmunk Jul 23 '21 at 14:41
  • Try using " = " instead of '==' – Zichzheng Jul 23 '21 at 16:32
  • (if comparing using " = " or " != ", put it in parentheses) – Zichzheng Jul 23 '21 at 16:34
  • And if your df has null values, use 0 to replace them. – Zichzheng Jul 23 '21 at 16:38
  • Thank you very much for your help, Zichzheng. I tried "=", but it didn't work, and there are no missing values. Not sure what the problem is... – Treepmunk Jul 25 '21 at 04:17
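
For what it's worth, the error in this thread comes from the `if` statements receiving a whole Series, which is why neither the parentheses nor changing the dtype helped. One option that keeps the original `if`/`elif` logic is to call the function once per row, so that every comparison sees plain numbers. This is a rough sketch: the name `cal_freqw_scalar` and the choice to return `None` when no rule matches are assumptions of mine, and the data are the five rows posted above.

```python
import pandas as pd

def cal_freqw_scalar(unit, num):
    """Scalar version of the question's function: both arguments are single numbers."""
    if unit == 1:
        return 0
    elif unit == 2 and num < 998:   # plain `and` is fine for scalars
        return 7 * num
    elif unit == 3:
        return num / 31
    elif unit == 98:
        return 9998
    elif num == 998:
        return 9998
    return None  # no rule matched (assumption); becomes a missing value in the column

# First five observations reported by the asker
df = pd.DataFrame({"UNIT": [98, 3, 4, 3, 3], "NUM": [998, 3, 2, 5, 20]})

# Apply the function row by row instead of passing the whole columns at once
df["FREQW"] = [cal_freqw_scalar(u, n) for u, n in zip(df["UNIT"], df["NUM"])]
print(df)
```

This is slower than a vectorized approach on large data, but it can be reused for other column pairs just by changing the two column names.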