
In the data below, I need to add extra columns based on certain comparisons between the existing columns.

test_file.csv

day v1  v2  v3
mon 38  42  42
tue 45  35  43
wed 36  45  43
thu 41  35  45
fri 37  42  44
sat 40  43  42
sun 43  40  43

I have tried these lines of code, and it throws the error shown just below the code.

df["Compare_col_1"] = ""
df["Compare_col_2"] = ""

if ((df.v3 < df.v1) & (df.v2 > df.v1)):
    df["Compare_col_1"] = "Balanced"
else:
    df["Compare_col_1"] = "Out_of_Bounds"


if df.v3 < df.v2:
    df["Compare_col_2"] = "Eligible"
else:
    df["Compare_col_2"] = "Slow"

Error (using pandas only):

Traceback (most recent call last):
  File "C:\Trials\Test.py", line 291, in <module>
    if ((df.v3 < df.v1) & (df.v2 > df.v1)):
  File "C:\Winpy\WPy64-3770\python-3.7.7.amd64\lib\site-packages\pandas\core\generic.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I have seen several articles that give excellent explanations of how to use numpy for the results I need, but the same error repeats, as shown below.

New code (with numpy):

if (np.logical_and((df.SMA_8d < df.ClosePrice) , (df.ClosePrice < df.SMA_3d))):
    df["Mark2"] = "True"
else:
    df["Mark2"] = "False"

Traceback (most recent call last):
  File "C:\Trials\Test.py", line 291, in <module>
    if (np.logical_and((df.v3 < df.v1), (df.v2 > df.v1))):
  File "C:\Winpy\WPy64-3770\python-3.7.7.amd64\lib\site-packages\pandas\core\generic.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Is there any solution for generating those new columns by comparing adjacent columns (and, more importantly, a solution in pandas only)?

Ben.T
Lokkii9

1 Answer


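A comparison between two columns returns a boolean Series (one value per row), and a plain `if` cannot reduce that to a single True/False, hence the error. You can use np.where, which picks a value row by row instead: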

import numpy as np

# Choose a label per row from the boolean condition
df["Compare_col_1"] = np.where((df.v3 < df.v1) & (df.v2 > df.v1), "Balanced", "Out_of_Bounds")
df["Compare_col_2"] = np.where(df.v3 < df.v2, "Eligible", "Slow")
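
For reference, here is a self-contained sketch of the same idea run against the sample rows from the question. It assumes the file is comma-separated and inlines the data with io.StringIO so it runs without test_file.csv. The names `balanced`, `eligible`, `pandas_only_1` and `pandas_only_2` are only illustrative. The last two lines show one possible pandas-only variant that uses Series.map on the boolean mask instead of np.where.

```python
import io

import numpy as np
import pandas as pd

# Sample data from the question, inlined (assumed comma-separated).
csv_text = """day,v1,v2,v3
mon,38,42,42
tue,45,35,43
wed,36,45,43
thu,41,35,45
fri,37,42,44
sat,40,43,42
sun,43,40,43
"""
df = pd.read_csv(io.StringIO(csv_text))

# Each comparison gives a boolean Series with one value per row.
balanced = (df.v3 < df.v1) & (df.v2 > df.v1)
eligible = df.v3 < df.v2

# np.where picks the first label where the mask is True, else the second.
df["Compare_col_1"] = np.where(balanced, "Balanced", "Out_of_Bounds")
df["Compare_col_2"] = np.where(eligible, "Eligible", "Slow")

# pandas-only alternative: map the boolean mask to labels.
pandas_only_1 = balanced.map({True: "Balanced", False: "Out_of_Bounds"})
pandas_only_2 = eligible.map({True: "Eligible", False: "Slow"})

print(df)
```

With these seven rows, no day satisfies both conditions at once, so Compare_col_1 comes out as Out_of_Bounds throughout. Only wed and sat are Eligible in Compare_col_2.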
Ben.T
  • Dear Ben, thank you so much for the answer. Could you please also help with one more element I was about to add to the question: checking the previous day's data with the .shift() function? For example, if the previous day's 'Compare_col_1' is Balanced, then add something to a new column 'Compare_col_3'. Kindly guide me on this as well by adding an extra line, please. Thank you so much – Lokkii9 May 09 '20 at 17:59
  • @Lokkii9 something like `df['Compare_col_3'] = np.where(df['Compare_col_1'].shift().eq('Balanced'), 'yesterday_balanced', 'yesterday_out')`? – Ben.T May 09 '20 at 18:03
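
A minimal sketch of that shift-based comparison, for anyone who wants to try it. The four labels in Compare_col_1 are made up for illustration (in the question's sample data every row is Out_of_Bounds), and `prev` is just a helper name. Note that the first row has no previous day, so shift() leaves it missing and it falls into the 'yesterday_out' branch.

```python
import numpy as np
import pandas as pd

# Hypothetical labels for illustration only, not the question's data.
df = pd.DataFrame({
    "day": ["mon", "tue", "wed", "thu"],
    "Compare_col_1": ["Balanced", "Out_of_Bounds", "Balanced", "Balanced"],
})

# shift() moves every value down one row, so each row sees the previous day's label.
prev = df["Compare_col_1"].shift()

# The first row has no previous day; its missing value compares as False.
df["Compare_col_3"] = np.where(prev.eq("Balanced"), "yesterday_balanced", "yesterday_out")

print(df)
```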